PREFACE


In today’s world, analytical thinking is a critical part of any solid education. An
important segment of this kind of reasoning, one that cuts across many disciplines, is
discrete mathematics. Discrete math concerns counting, probability, (sophisticated
forms of) addition, and limit processes over discrete sets. Combinatorics, graph
theory, the idea of function, recurrence relations, permutations, and set theory are
all part of discrete math. Sequences and series are among the most important
applications of these ideas.

Discrete mathematics is an essential part of the foundations of (theoretical) 
computer science, statistics, probability theory, and algebra. The ideas come up 
repeatedly in different parts of calculus. Many would argue that discrete math is 
the most important component of all modern mathematical thought. 

Most basic math courses (at the freshman and sophomore level) are oriented 
toward problem-solving. Students can rely heavily on the provided examples as a 
crutch to learn the basic techniques and pass the exams. Discrete mathematics is, by 
contrast, rather theoretical. It involves proofs and ideas and abstraction. Freshmen
and sophomores in college these days have little experience with theory or with 
abstract thinking. They simply are not intellectually prepared for such material. 

Steven G. Krantz is an award-winning teacher, author of the book How to Teach 
Mathematics. He knows how to present mathematical ideas in a concrete fashion 
that students can absorb and master in a comfortable fashion. He can explain even 
abstract concepts in a hands-on fashion, making the learning process natural and 
fluid. Examples can be made tactile and real, thus helping students to finesse abstract 
technicalities. This book will serve as an ideal supplement to any standard text. It 
will help students over the traditional “hump” that the first theoretical math course 
constitutes. It will make the course palatable. Krantz has already authored two 
successful Demystified books. 

The good news is that discrete math, particularly sequences and series, can 
be illustrated with concrete examples from the real world. They can be made to 
be realistic and approachable. Thus the rather difficult set of ideas can be made 
accessible to a broad audience of students. For today’s audience, which consists not
only of mathematics students but also of engineers, physicists, premedical students,
social scientists, and others, this feature is especially important.

A typical audience for this book will be freshman and sophomore students in the 
mathematical sciences, in engineering, in physics, and in any field where analytical 
thinking will play a role. Today premedical students, nursing students, business 
students, and many others take some version of calculus or discrete math or both. 
They will definitely need help with these theoretical topics. 

This text has several key features that make it unique and useful: 


1. The book makes abstract ideas concrete. All concepts are presented succinctly 
and clearly. 


2. Real-world examples illustrate ideas and make them accessible. 


3. Applications and examples come from real, believable contexts that are 
familiar and meaningful. 


4. Exercises develop both routine and analytical thinking skills. 


5. The book relates discrete math ideas to other parts of mathematics and 
science. 


Discrete Mathematics Demystified explains this panorama of ideas in a step-by-step
and accessible manner. The author, a renowned teacher and expositor, has a
strong sense of the level of the students who will read this book, their backgrounds 
and their strengths, and can present the material in accessible morsels that the student 
can study on his or her own. Well-chosen examples and cognate exercises will 
reinforce the ideas being presented. Frequent review, assessment, and application
of the ideas will help students to retain and to internalize all the important concepts
of discrete mathematics.

Discrete Mathematics Demystified will be a valuable addition to the self-help 
literature. Written by an accomplished and experienced teacher, this book will also 
aid the student who is working without a teacher. It will provide encouragement and 
reinforcement as needed, and diagnostic exercises will help the student to measure 
his or her progress. 


CHAPTER 1

Logic





Strictly speaking, our approach to logic is “intuitive” or “naïve.” Whereas in 
ordinary conversation these emotion-charged words may be used to downgrade 
the value of that which is being described, our use of these words is more technical. 
What is meant is that we shall prescribe in this chapter certain rules of logic, which 
are to be followed in the rest of the book. They will be presented to you in such a 
way that their validity should be intuitively appealing and self-evident. We cannot 
prove these rules. The rules of logic are the point where our learning begins. A 
more advanced course in logic will explore other logical methods. The ones that 
we present here are universally accepted in mathematics and in most of science and 
analytical thought. 

We shall begin with sentential logic and elementary connectives. This material is 
called the propositional calculus (to distinguish it from the predicate calculus, which 
will be treated later). In other words, we shall be discussing propositions—which 
are built up from atomic statements and connectives. The elementary connectives 
include “and,” “or,” “not,” “if-then,” and “if and only if.” Each of these will have a 
precise meaning and will have exact relationships with the other connectives. 

An elementary statement (or atomic statement) is a sentence with a subject and
a verb (and sometimes an object) but no connectives (and, or, not, if-then, if and
only if). For example,


John is good 
Mary has bread 
Ethel reads books 


are all atomic statements. We build up sentences, or propositions, from atomic 
statements using connectives. 

Next we shall consider the quantifiers “for all” and “there exists” and their 
relationships with the connectives from the last paragraph. The quantifiers will 
give rise to the so-called predicate calculus. Connectives and quantifiers will prove 
to be the building blocks of all future statements in this book, indeed in all of 
mathematics. 


1.1 Sentential Logic 


In everyday conversation, people sometimes argue about whether a statement is true
or not. In mathematics there is nothing to argue about. In practice a sensible
statement in mathematics is either true or false, and there is no room for opinion about
this attribute. How do we determine which statements are true and which are false?
The modern methodology in mathematics works as follows:


• We define certain terms.

• We assume that these terms, or statements about them, have certain properties
or truth attributes (these assumptions are called axioms).

• We specify certain rules of logic.


Any statement that can be derived from the axioms, using the rules of logic, is 
understood to be true. It is not necessarily the case that every true statement can be 
derived in this fashion. However, in practice this is our method for verifying that a 
statement is true. 

On the other hand, a statement is false if it is inconsistent with the axioms and 
the rules of logic. That is to say, a statement is false if the assumption that it is true 
leads to a contradiction. Alternatively, a statement P is false if the negation of P can 
be established or proved. While it is possible for a statement to be false without our 
being able to derive a contradiction in this fashion, in practice we establish falsity 
by the method of contradiction or by giving a counterexample (which is another 
aspect of the method of contradiction).

The point of view being described here is special to mathematics. While it is
indeed true that mathematics is used to model the world around us—in physics, 
engineering, and in other sciences—the subject of mathematics itself is a man-made 
system. Its internal coherence is guaranteed by the axiomatic method that we have 
just described. 

It is worth mentioning that “truth” in everyday life is treated differently. When you
tell someone “I love you” and you are asked for proof, a mathematical verification 
will not do the job. You will offer empirical evidence of your caring, your fealty, 
your monogamy, and so forth. But you cannot give a mathematical proof. In a court 
of law, when an attorney “proves” a case, he/she does so by offering evidence and 
arguing from that evidence. The attorney cannot offer a mathematical argument. 

The way that we reason in mathematics is special, but it is ideally suited to the 
task that we must perform. It is a means of rigorously manipulating ideas to arrive 
at new truths. It is a methodology that has stood the test of time for thousands of 
years, and that guarantees that our ideas will travel well and apply to a great variety 
of situations and applications. 

It is reasonable to ask whether mathematical truth is a construct of the human 
mind or an immutable part of nature. For instance, is the assertion that “the area of
a circle is π times the radius squared” actually a fact of nature just like Newton’s
inverse square law of gravitation? Our point of view is that mathematical truth is 
relative. The formula for the area of a circle is a logical consequence of the axioms 
of mathematics, nothing more. The fact that the formula seems to describe what 
is going on in nature is convenient, and is part of what makes mathematics useful. 
But that aspect is something over which we as mathematicians have no control. Our 
concern is with the internal coherence of our logical system. 

It can be asserted that a “proof” (a concept to be discussed later in the book) is a 
psychological device for convincing the reader that an assertion is true. However, our 
view in this book is more rigid: a proof of an assertion is a sequence of applications 
of the rules of logic to derive the assertion from the axioms. There is no room for 
opinion here. The axioms are plain. The rules are rigid. A proof is like a sequence 
of moves in a game of chess. If the rules are followed then the proof is correct. 
Otherwise not. 


1.2 “And” and “Or” 


Let A be the statement “Arnold is old.” and B be the statement “Arnold is fat.” The 
new statement 


“A and B” 


means that both A is true and B is true. Thus


Arnold is old and Arnold is fat


means both that Arnold is old and Arnold is fat. If we meet Arnold and he turns out 
to be young and fat, then the statement is false. If he is old and thin then the statement 
is false. Finally, if Arnold is both young and thin then the statement is false. The 
statement is true precisely when both properties—oldness and fatness—hold. We 
may summarize these assertions with a truth table. We let 


A = Arnold is old

and

B = Arnold is fat

The expression

A ∧ B

will denote the phrase “A and B.” We call this statement the conjunction of A and
B. The letters “T” and “F” denote “True” and “False” respectively. Then we have


A   B   A ∧ B
T   T     T
T   F     F
F   T     F
F   F     F

Notice that we have listed all possible truth values of A and B and the corresponding
values of the conjunction A ∧ B. The conjunction is true only when both A and B are
true. Otherwise it is false. This property is a special feature of conjunction, or “and.”

In a restaurant the menu often contains phrases such as


soup or salad 


This means that we may select soup or select salad, but we may not select both. 
This use of the word “or” is called the exclusive “or”; it is not the meaning of “or” 
that we use in mathematics and logic. In mathematics we instead say that “A or 
B” is true provided that A is true or B is true or both are true. This is the inclusive 
“or.” If we let A ∨ B denote “A or B” then the truth table is


A   B   A ∨ B
T   T     T
T   F     T
F   T     T
F   F     F


We call the statement A ∨ B the disjunction of A and B. Note that this disjunction
is true in three out of four cases: the only time the disjunction is false is if both
components are false.

The reason that we use the inclusive form of “or” in mathematics is that this 
form of “or” has a nice relationship with “and,” as we shall see below. The other 
form of “or” does not. 
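
The truth tables for “and” and “or” can also be generated mechanically. The following is a minimal Python sketch; the helper names truth_table and show are illustrative choices, not notation from this book. It lists the four assignments in the order TT, TF, FT, FF and includes the exclusive “or” of the menu example for comparison.

```python
from itertools import product

def truth_table(connective):
    """Rows (A, B, value) for a two-place connective, in the order TT, TF, FT, FF."""
    return [(a, b, connective(a, b)) for a, b in product([True, False], repeat=2)]

def show(value):
    """Render a Python bool as the letter T or F."""
    return "T" if value else "F"

conjunction = truth_table(lambda a, b: a and b)    # "A and B"
disjunction = truth_table(lambda a, b: a or b)     # inclusive "or"
exclusive_or = truth_table(lambda a, b: a != b)    # the menu's "soup or salad"

for title, rows in [("A and B", conjunction), ("A or B", disjunction),
                    ("A xor B", exclusive_or)]:
    print("A B", title)
    for a, b, value in rows:
        print(show(a), show(b), show(value))
```

The inclusive and exclusive tables differ only in the first row, where both A and B are true.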

We see from the truth table that the only way that “A or B” can be false is if both 
A is false and B is false. For instance, the statement 


Hilary is beautiful or Hilary is poor 


means that Hilary is either beautiful or poor or both. In particular, she will not 
be both ugly and rich. Another way of saying this is that if she is ugly she will 
compensate by being poor; if she is rich she will compensate by being beautiful. 
But she could be both beautiful and poor. 


EXAMPLE 1.1 
The statement 


x > 2 and x < 5


is true for the number x = 3 because this value of x is both greater than 2 and less
than 5. It is false for x = 6 because this x value is greater than 2 but not less than
5. It is false for x = 1 because this x is less than 5 but not greater than 2.














EXAMPLE 1.2 
The statement 


x is odd and x is a perfect cube


is true for x = 27 because both assertions hold. It is false for x = 7 because this x,
while odd, is not a cube. It is false for x = 8 because this x, while a cube, is not
odd. It is false for x = 10 because this x is neither odd nor a cube.


EXAMPLE 1.3
The statement 


x < 3 or x > 6


is true for x = 2 since this x is < 3 (even though it is not > 6). It holds (that is, it 
is true) for x = 9 because this x is > 6 (even though it is not < 3). The statement 
fails (that is, it is false) for x = 4 since this x is neither < 3 nor > 6. 














EXAMPLE 1.4 
The statement 


x > 1 or x < 4


is true for every real x. As an exercise, you should provide a detailed reason for
this answer. (Hint: Consider separately the cases x < 1, x = 1, 1 < x < 4, x = 4,
and x > 4.)
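
Statements such as those in Examples 1.1 through 1.4 translate directly into Python’s `and` and `or`. The sketch below (all function names are hypothetical) evaluates them at the values discussed in the text. Checking finitely many sample values illustrates Example 1.4 but does not replace the case analysis requested in the hint.

```python
def example_1_1(x):
    return x > 2 and x < 5

def is_perfect_cube(x):
    """True when the nonnegative integer x equals k**3 for some integer k."""
    return any(k ** 3 == x for k in range(x + 1))

def example_1_2(x):
    return x % 2 == 1 and is_perfect_cube(x)

def example_1_3(x):
    return x < 3 or x > 6

def example_1_4(x):
    return x > 1 or x < 4

print([example_1_1(x) for x in (3, 6, 1)])
print([example_1_2(x) for x in (27, 7, 8, 10)])
print([example_1_3(x) for x in (2, 9, 4)])
# One sample value from each case listed in the hint to Example 1.4.
print(all(example_1_4(x) for x in (0, 1, 2.5, 4, 7)))
```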














EXAMPLE 1.5 
The statement (A ∨ B) ∧ B has the following truth table:


A   B   A ∨ B   (A ∨ B) ∧ B
T   T     T          T
T   F     T          F
F   T     T          T
F   F     F          F
Mans 














You Try It: Construct a truth table for the statement 
The number x is positive and is a perfect square 


Notice in Example 1.5 that the statement (A ∨ B) ∧ B has the same truth values
as the simpler statement B. In what follows, we shall call such pairs of statements 
(having the same truth values) logically equivalent. 
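
Logical equivalence of two statements can be tested by comparing them on every assignment of truth values. The sketch below assumes a helper named logically_equivalent (a name introduced here for illustration) and applies it to the pair from Example 1.5.

```python
from itertools import product

def logically_equivalent(f, g, variables=2):
    """True when f and g agree on every assignment of truth values."""
    return all(f(*row) == g(*row)
               for row in product([True, False], repeat=variables))

compound = lambda a, b: (a or b) and b   # (A or B) and B, as in Example 1.5
simple = lambda a, b: b                  # the simpler statement B

print(logically_equivalent(compound, simple))
```

With two atomic statements there are only four rows to compare, so the check is exhaustive rather than a sample.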

The words “and” and “or” are called connectives; their role in sentential logic 
is to enable us to build up (or to connect together) pairs of statements. The idea is 
to use very simple statements, like “Jennifer is swift.” as building blocks; then we 
compose more complex statements from these building blocks by using connectives.

In the next two sections we will become acquainted with the other two basic
connectives “not” and “if-then.” We shall also say a little bit about the compound 
connective “if and only if.” 


1.3 “Not” 


The statement “not A,” written ~ A, is true whenever A is false. For example, the 
statement 


Charles is not happily married 


is true provided the statement “Charles is happily married” is false. The truth table
for ~A is as follows:


A   ~A
T    F
F    T


Greater understanding is obtained by combining the connectives:


EXAMPLE 1.6 
We examine the truth table for ~(A ∧ B):


A   B   A ∧ B   ~(A ∧ B)
T   T     T        F
T   F     F        T
F   T     F        T
F   F     F        T














EXAMPLE 1.7 
Now we look at the truth table for (~A) ∨ (~B):


A   B   ~A   ~B   (~A) ∨ (~B)
T   T   F    F         F
T   F   F    T         T
F   T   T    F         T
F   F   T    T         T

Notice that the statements ~(A ∧ B) and (~A) ∨ (~B) have the same truth
table. As previously noted, such pairs of statements are called logically equivalent.

The logical equivalence of ~(A ∧ B) with (~A) ∨ (~B) makes good intuitive
sense: the statement A ∧ B fails [that is, ~(A ∧ B) is true] precisely when either A
is false or B is false, that is, precisely when (~A) ∨ (~B) holds. Since in mathematics
we cannot rely on our intuition to establish facts, it is important to have the truth
table technique for establishing logical equivalence. The exercise set will give you
further practice with this notion.

One of the main reasons that we use the inclusive definition of “or” rather than the
exclusive one is so that the connectives “and” and “or” have the nice relationship just
discussed. It is also the case that ~(A ∨ B) and (~A) ∧ (~B) are logically
equivalent. These logical equivalences are sometimes referred to as de Morgan’s laws.
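
Both of de Morgan’s laws can be confirmed by brute force over the four truth assignments. The short sketch below does so in Python and, as a point of comparison, tries the first law with the exclusive “or” (written `!=` on truth values) in place of the inclusive one; the variable names are illustrative.

```python
from itertools import product

rows = list(product([True, False], repeat=2))

# First law: not (A and B)  versus  (not A) or (not B)
first_law = [(not (a and b), (not a) or (not b)) for a, b in rows]
# Second law: not (A or B)  versus  (not A) and (not B)
second_law = [(not (a or b), (not a) and (not b)) for a, b in rows]

# Variant of the first law with the exclusive "or" substituted for the inclusive one.
exclusive_variant = [(not (a and b), (not a) != (not b)) for a, b in rows]

print(all(left == right for left, right in first_law))
print(all(left == right for left, right in second_law))
print(all(left == right for left, right in exclusive_variant))
```

The exclusive variant breaks down on the row where A and B are both false, which is one concrete way to see why the inclusive “or” is the convenient choice.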


1.4 “If-Then” 


A statement of the form “If A then B” asserts that whenever A is true then B is 
also true. This assertion (or “promise’’) is tested when A is true, because it is then 
claimed that something else (namely B) is true as well. However, when A is false 
then the statement “If A then B” claims nothing. Using the symbols A ⇒ B to
denote “If A then B”, we obtain the following truth table:


A   B   A ⇒ B
T   T     T
T   F     F
F   T     T
F   F     T


Notice that we use here an important principle of Aristotelian logic: every
sensible statement is either true or false. There is no “in between” status. When A is
false we can hardly assert that A ⇒ B is false. For A ⇒ B asserts that “whenever
A is true then B is true”, and A is not true!

Put in other words, when A is false then the statement A ⇒ B is not tested. It
therefore cannot be false. So it must be true. We refer to A as the hypothesis of
the implication and to B as the conclusion of the implication. When the if-then
statement is true, then the hypothesis implies the conclusion.


EXAMPLE 1.8 
The statement “If 2 = 4 then Calvin Coolidge was our greatest president” is true. 
This is the case no matter what you think of Calvin Coolidge. The point is that 
the hypothesis (2 = 4) is false; thus it doesn’t matter what the truth value of the
conclusion is. According to the truth table for implication, the sentence is true. 

The statement “If fish have hair then chickens have lips” is true. Again, the 
hypothesis is false so the sentence is true. 

The statement “If 9 > 5 then dogs don’t fly” is true. In this case the hypothesis 
is certainly true and so is the conclusion. Therefore the sentence is true. 

(Notice that the “if” part of the sentence and the “then” part of the sentence need 
not be related in any intuitive sense. The truth or falsity of an “if-then” statement 
is simply a fact about the logical values of its hypothesis and of its conclusion.) 














EXAMPLE 1.9 
The statement A ⇒ B is logically equivalent with (~A) ∨ B. For the truth table
for the latter is


A   B   ~A   (~A) ∨ B
T   T   F       T
T   F   F       F
F   T   T       T
F   F   T       T


which is the same as the truth table for A ⇒ B.





You should think for a bit to see that (~A) ∨ B says the same thing as A ⇒ B.
To wit, assume that the statement (~A) ∨ B is true. Now suppose that A is true. It
follows that ~A is false. Then, according to the disjunction, B must be true. But
that says that A ⇒ B. For the converse, assume that A ⇒ B is true. This means that
if A holds then B must follow. So either A fails or B holds, which is just (~A) ∨ B.
So the two statements are equivalent, that is, they say the same thing.

Once you believe that assertion, then the truth table for (~A) ∨ B gives us
another way to understand the truth table for A ⇒ B.¹
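
Python has no built-in “if-then” operator on truth values, so a common workaround is exactly the equivalence of Example 1.9. The sketch below stores the truth table for implication as a dictionary (the names IMPLIES and implies are choices made for this illustration) and compares it, row by row, with the expression (not A) or B.

```python
from itertools import product

# Truth table for "If A then B", keyed by (A, B), copied from the text.
IMPLIES = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}

def implies(a, b):
    """Implication rewritten as the disjunction (not A) or B."""
    return (not a) or b

for a, b in product([True, False], repeat=2):
    # Each printed row shows A, B, the table entry, and the computed value.
    print(a, b, IMPLIES[(a, b)], implies(a, b))
```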

There are in fact infinitely many pairs of logically equivalent statements. But just 
a few of these equivalences are really important in practice—most others are built 
up from these few basic ones. Some of the other basic pairs of logically equivalent 
statements are explored in the exercises. 


EXAMPLE 1.10 
The statement 


If x is negative then −5 · x is positive


is true. For if x < 0 then −5 · x is indeed > 0; if x ≥ 0 then the statement is
unchallenged.

¹Once again, this logical equivalence illustrates the usefulness of the inclusive version of “or.”














EXAMPLE 1.11 
The statement 


If (x > 0 and x² < 0) then x > 10


is true since the hypothesis “x > 0 and x² < 0” is never true.





EXAMPLE 1.12 
The statement 


If x > 0 then (x² < 0 or 2x < 0)


is false since the conclusion “x² < 0 or 2x < 0” is false whenever the hypothesis
x > 0 is true.














EXAMPLE 1.13 
Let us construct a truth table for the statement [A ∨ (~B)] ⇒ [(~A) ∧ B].


A   B   ~A   ~B   A ∨ (~B)   (~A) ∧ B   [A ∨ (~B)] ⇒ [(~A) ∧ B]
T   T   F    F       T          F                  F
T   F   F    T       T          F                  F
F   T   T    F       F          T                  T
F   F   T    T       T          F                  F














Notice that the statement [A ∨ (~B)] ⇒ [(~A) ∧ B] has the same truth table as
~(B ⇒ A). Can you comment on the logical equivalence of these two statements?
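
The comparison in Example 1.13 can be reproduced by computing both columns. This is a small sketch in which implication is coded as (not A) or B, following Example 1.9; the function names are hypothetical.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def example_1_13(a, b):
    """[A or (not B)] => [(not A) and B]"""
    return implies(a or (not b), (not a) and b)

def negated_converse(a, b):
    """not (B => A)"""
    return not implies(b, a)

table = [(a, b, example_1_13(a, b), negated_converse(a, b))
         for a, b in product([True, False], repeat=2)]
for row in table:
    print(row)
```

The last two entries of every row agree, and both are true only when A is false and B is true.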

Perhaps the most commonly used logical syllogism is the following. Suppose
that we know the truth of A and of A ⇒ B. We wish to conclude B. Examine the
truth table for A ⇒ B. The only line in which both A is true and A ⇒ B is true
is the line in which B is true. That justifies our reasoning. In logic texts, the
syllogism we are discussing is known as modus ponendo ponens or, more briefly,
modus ponens.


EXAMPLE 1.14 
Consider the two statements 


It is cloudy 
and 
If it is cloudy then it is raining 


We think of the first of these as A and the second as A ⇒ B. From these two taken
together we may conclude B, or 





It is raining 











EXAMPLE 1.15 
The statement 


Every yellow dog has fleas 
together with the statement 
Fido is a blue dog 


allows no logical conclusion. The first statement has the form A ⇒ B but the second
statement is not A. So modus ponendo ponens does not apply. 














EXAMPLE 1.16 
Consider the two statements 


All Martians eat breakfast 
and 
My friend Jim eats breakfast 


It is quite common, in casual conversation, for people to abuse logic and to conclude
that Jim must be a Martian. Of course this is an incorrect application of modus
ponendo ponens. In fact no conclusion is possible.
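
The difference between modus ponens and the faulty Martian argument of Example 1.16 shows up when we filter the rows of the truth table for implication. The following Python sketch (names are illustrative) keeps the rows consistent with each pair of premises.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

rows = list(product([True, False], repeat=2))

# Modus ponens: keep the rows where A and (A => B) are both true.
ponens_rows = [(a, b) for a, b in rows if a and implies(a, b)]

# The Martian argument: keep the rows where B and (A => B) are both true.
martian_rows = [(a, b) for a, b in rows if b and implies(a, b)]

print(ponens_rows)    # B is true in every remaining row
print(martian_rows)   # A is true in one remaining row and false in another
```

Because the premises of the Martian argument leave both truth values of A possible, they settle nothing about A.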


1.5 Contrapositive, Converse, and “Iff”


The statement 


If A then B 
is the same as 
A ⇒ B
or 
A suffices for B 
or as saying 
A only if B 


All these forms are encountered in practice, and you should think about them long 
enough to realize that they say the same thing. 
On the other hand, 


If B then A 
is the same as saying 
B ⇒ A
or 
A is necessary for B 
or as saying 
A if B


We call the statement B ⇒ A the converse of A ⇒ B. The converse of an implication
is logically distinct from the implication itself. Generally speaking, the converse 
will not be logically equivalent to the original implication. The next two examples 
illustrate the point. 


EXAMPLE 1.17 
The converse of the statement 


If x is a healthy horse then x has four legs 


is the statement


If x has four legs then x is a healthy horse


Notice that these statements have very different meanings: the first statement is true 
while the second (its converse) is false. For instance, a chair has four legs but it is 
not a healthy horse. Likewise for a pig. 














EXAMPLE 1.18 
The statement 


If x > 5 then x > 3


is true. Any number that is greater than 5 is certainly greater than 3. But the converse


If x > 3 then x > 5


is certainly false. Take x = 4. Then the hypothesis is true but the conclusion is 
false. 














The statement 
A if and only if B 
is a brief way of saying 
If A then B and If B then A 


We abbreviate A if and only if B as A ⇔ B or as A iff B. Now we look at a truth
table for A ⇔ B.


A   B   A ⇒ B   B ⇒ A   A ⇔ B
T   T     T       T       T
T   F     F       T       F
F   T     T       F       F
F   F     T       T       T


Notice that we can say that A ⇔ B is true only when both A ⇒ B and B ⇒ A are
true. An examination of the truth table reveals that A ⇔ B is true precisely when
A and B are either both true or both false. Thus A ⇔ B means precisely that A and
B are logically equivalent. One is true when and only when the other is true.
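
The description of “if and only if” as a pair of implications can be written out directly. In the sketch below, iff is a hypothetical helper built from two implications, and its values are printed next to the plain equality test on truth values.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

def iff(a, b):
    """A if and only if B, built as (A => B) and (B => A)."""
    return implies(a, b) and implies(b, a)

for a, b in product([True, False], repeat=2):
    # On truth values, iff(a, b) coincides with the equality test a == b.
    print(a, b, iff(a, b), a == b)
```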


EXAMPLE 1.19
The statement 


x > 0 ⇔ 2x > 0











is true. For if x > 0 then 2x > 0; and if 2x > 0 then x > 0. 





EXAMPLE 1.20 
The statement 


x > 0 ⇔ x² > 0


is false. For x > 0 ⇒ x² > 0 is certainly true while x² > 0 ⇒ x > 0 is false
[(−3)² > 0 but −3 ≯ 0].














EXAMPLE 1.21 
The statement 


[~(A ∨ B)] ⇔ [(~A) ∧ (~B)]     (1.1)


is true because the truth table for ~(A ∨ B) and that for (~A) ∧ (~B) are the
same. Thus they are logically equivalent: one statement is true precisely when the
other is. Another way to see the truth of Eq. (1.1) is to examine the truth table:


A   B   ~(A ∨ B)   (~A) ∧ (~B)   [~(A ∨ B)] ⇔ [(~A) ∧ (~B)]
T   T      F            F                    T
T   F      F            F                    T
F   T      F            F                    T
F   F      T            T                    T














Given an implication 
A ⇒ B

the contrapositive statement is defined to be the implication

~B ⇒ ~A


The contrapositive (unlike the converse) is logically equivalent to the original
implication, as we see by examining their truth tables:


CHAPTER 1 Logic 15 





A   B   A ⇒ B
T   T     T
T   F     F
F   T     T
F   F     T

and

A   B   ~A   ~B   (~B) ⇒ (~A)
T   T   F    F         T
T   F   F    T         F
F   T   T    F         T
F   F   T    T         T


EXAMPLE 1.22 
The statement 


If it is raining, then it is cloudy 
has, as its contrapositive, the statement 
If there are no clouds, then it is not raining 


A moment’s thought convinces us that these two statements say the same thing: if 
there are no clouds, then it could not be raining; for the presence of rain implies the 
presence of clouds. 














The main point to keep in mind is that, given an implication A ⇒ B, its converse
B ⇒ A and its contrapositive (~B) ⇒ (~A) are entirely different statements.
The converse is distinct from, and logically independent from, the original
statement. The contrapositive is distinct from, but logically equivalent to, the original
statement.
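
The contrast between converse and contrapositive can be checked by listing the three truth-value columns. The sketch below, with implication coded as (not A) or B and with variable names chosen for illustration, also reports the assignments on which an implication and its converse disagree.

```python
from itertools import product

def implies(a, b):
    return (not a) or b

rows = list(product([True, False], repeat=2))

original = [implies(a, b) for a, b in rows]
converse = [implies(b, a) for a, b in rows]
contrapositive = [implies(not b, not a) for a, b in rows]

print(original == contrapositive)
print(original == converse)
# Assignments (A, B) on which the implication and its converse disagree.
print([row for row, x, y in zip(rows, original, converse) if x != y])
```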

Some classical treatments augment the concept of modus ponens with the idea of 
modus tollendo tollens or modus tollens. It is in fact logically equivalent to modus 
ponens. Modus tollens says 


If ~B and A ⇒ B then ~A


Modus tollens actualizes the fact that ~B ⇒ ~A is logically equivalent to
A ⇒ B. The first of these implications is of course the contrapositive of the second.


1.6 Quantifiers 


The mathematical statements that we will encounter in practice will use the
connectives “and,” “or,” “not,” “if-then,” and “iff.” They will also use quantifiers. The
two basic quantifiers are “for all” and “there exists.”


EXAMPLE 1.23 
Consider the statement 


All automobiles have wheels 


This statement makes an assertion about all automobiles. It is true, because every 
automobile does have wheels. 
Compare this statement with the next one: 


There exists a woman who is blonde 


This statement is of a different nature. It does not claim that all women have blonde 
hair—merely that there exists at least one woman who does. Since that is true, the 
statement is true. 














EXAMPLE 1.24 
Consider the statement 


All positive real numbers are integers 


This sentence asserts that something is true for all positive real numbers. It is indeed
true for some positive numbers, such as 1 and 2 and 193. However, it is false for at
least one positive number (such as 1/10 or π), so the entire statement is false.

Here is a more extreme example:


The square of any real number is positive 


This assertion is almost true: the only exception is the real number 0, since 0² = 0 is
not positive. But it only takes one exception to falsify a “for all” statement. So the 
assertion is false. 

This last example illustrates the principle that the negation of a “for all” statement 
is a “there exists” statement. 


EXAMPLE 1.25
Look at the statement 


There exists a real number which is greater than 5 
In fact there are lots of numbers which are greater than 5; some examples are 7, 42, 


27x, and 97/3. Other numbers, such as 1, 2, and 7/6, are not greater than 5. Since 
there is at least one number satisfying the assertion, the assertion is true. 














EXAMPLE 1.26 
Consider the statement 


There is a man who is at least 10 feet tall 


This statement is false. To verify that it is false, we must demonstrate that there 
does not exist a man who is at least 10 feet tall. In other words, we must show that
all men are shorter than 10 feet. 

The negation of a “there exists” statement is a “for all” statement. 

A somewhat different example is the sentence 


There exists a real number which satisfies the equation


x³ − 2x² + 3x − 6 = 0


There is in fact only one real number which satisfies the equation, and that is x = 2.
Yet that information is sufficient to show that the statement is true.
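
One way to confirm that x = 2 is the only real solution of this cubic is to factor it by grouping. The factorization below is a supporting step that the text does not spell out:

```latex
x^3 - 2x^2 + 3x - 6 = x^2(x - 2) + 3(x - 2) = (x - 2)(x^2 + 3)
```

Since x² + 3 is at least 3 for every real x, the product vanishes only when x = 2.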














We often use the symbol ∀ to denote “for all” and the symbol ∃ to denote “there
exists.” The assertion


∀x, x + 1 < x


claims that for every x, the number x + 1 is less than x. If we take our universe to
be the standard real number system, then this statement is false. The assertion


∃x, x² = x


claims that there is a number whose square equals itself. If we take our universe to
be the real numbers, then the assertion is satisfied by x = 0 and by x = 1. Therefore
the assertion is true.
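
On a finite universe, Python’s built-in all and any behave like “for all” and “there exists.” The sketch below uses a small range of integers as a hypothetical stand-in for the real numbers. A finite sample can refute a “for all” claim or confirm a “there exists” claim, but it cannot prove a “for all” claim about every real number.

```python
universe = range(-5, 6)   # hypothetical finite stand-in for the real numbers

# "For all x, x + 1 < x": a single failing x refutes the claim.
for_all_claim = all(x + 1 < x for x in universe)

# "There exists x with x**2 == x": a single witness confirms the claim.
there_exists_claim = any(x ** 2 == x for x in universe)
witnesses = [x for x in universe if x ** 2 == x]

print(for_all_claim, there_exists_claim, witnesses)
```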
In all the examples of quantifiers that we have discussed so far, we were careful 
to specify our universe (or at least the universe was clear from context). That is, 
“There is a woman such that ...” or “All positive real numbers are ...” or “All 
automobiles have ....” The quantified statement makes no sense unless we specify
the universe of objects from which we are making our specification. In the discussion 
that follows, we will always interpret quantified statements in terms of a universe. 
Sometimes the universe will be explicitly specified, while other times it will be 
understood from context. 

Quite often we will encounter ∀ and ∃ used together. The following examples
are typical: 


EXAMPLE 1.27 
The statement 


∀x ∃y, y > x


claims that, for any real number x, there is a number y which is greater than it. In
the realm of the real numbers this is true. In fact y = x + 1 will always do the trick.
The statement


∃x ∀y, y > x


has quite a different meaning from the first one. It claims that there is an x which
is less than every y. This is absurd. For instance, x is not less than y = x − 1.














EXAMPLE 1.28 
The statement 


∀x ∀y, x² + y² ≥ 0


is true in the realm of the real numbers: it claims that the sum of two squares is 
always greater than or equal to zero. (This statement happens to be false in the realm 
of the complex numbers. When we interpret a logical statement, it will always be 
important to understand the context, or universe, in which we are working.) 

The statement 


∃x ∃y, x + 2y = 7


is true in the realm of the real numbers: it claims that there exist x and y such that
x + 2y = 7. Certainly the numbers x = 3, y = 2 will do the job (although there
are many other choices that work as well). 














It is important to note that ∀ and ∃ do not commute. That is to say, ∀∃ and ∃∀
do not mean the same thing. Examine Example 1.27 with this thought in mind to
make sure that you understand the point.
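
The failure of the two quantifiers to commute can be seen on a tiny universe by nesting all and any in the two possible orders. On a finite universe the predicate y > x of Example 1.27 fails for the largest element, so this sketch swaps in the predicate “y is different from x”; both the universe and the predicate are assumptions made for the illustration.

```python
universe = [0, 1, 2]   # a small illustrative universe, chosen for this sketch

def related(x, y):
    """The stand-in predicate 'y is different from x'."""
    return y != x

# For all x there exists y with related(x, y).
forall_exists = all(any(related(x, y) for y in universe) for x in universe)

# There exists x such that for all y, related(x, y).
exists_forall = any(all(related(x, y) for y in universe) for x in universe)

print(forall_exists, exists_forall)
```

The first statement holds because each x has some other element to point to; the second fails because no x differs from every y, itself included.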

We conclude by noting that ∀ and ∃ are closely related. The statements


∀x, B(x) and ~∃x, ~B(x)


are logically equivalent. The first asserts that the statement B(x) is true for all values 
of x. The second asserts that there exists no value of x for which B(x) fails, which 
is the same thing. 

Likewise, the statements 


$$\exists x,\ B(x) \quad \text{and} \quad \sim \forall x,\ \sim B(x)$$


are logically equivalent. The first asserts that there is some x for which B(x) is true. 
The second claims that it is not the case that B(x) fails for every x, which is the 
same thing. 

A “for all” statement is something like the conjunction of a very large number 
of simpler statements. For example, the statement 


For every nonzero integer n, $n^2 > 0$


is actually an efficient way of saying that $1^2 > 0$ and $(-1)^2 > 0$ and $2^2 > 0$, and
so on. It is not feasible to apply truth tables to “for all” statements, and we usually 
do not do so. 

A “there exists” statement is something like the disjunction of a very large 
number of statements (the word “disjunction” in the present context means an “or” 
statement). For example, the statement 


There exists an integer n such that $P(n) = 2n^2 - 5n + 2 = 0$


is actually an efficient way of saying that “P(1) = 0 or P(-1) = 0 or P(2) = 0,
and so on.” It is not feasible to apply truth tables to “there exist” statements, and 
we usually do not do so. 
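
Although a truth table is out of reach for the full set of integers, the "large conjunction" and "large disjunction" readings can be tried on a finite sample. The following sketch assumes a window of integers from -10 to 10 (a hypothetical choice, not part of the text) and uses `all` for the conjunction and `any` for the disjunction. It also checks, on that window, the equivalence between "for all x, B(x)" and "there is no x for which B(x) fails."

```python
def P(n):
    """The polynomial from the text: P(n) = 2n^2 - 5n + 2."""
    return 2 * n * n - 5 * n + 2

window = range(-10, 11)  # assumed finite sample of the integers

# "For every nonzero integer n, n^2 > 0" read as one long conjunction.
squares_positive = all(n * n > 0 for n in window if n != 0)

# "There exists an integer n with P(n) = 0" read as one long disjunction.
has_integer_root = any(P(n) == 0 for n in window)

def B(n):
    """Sample statement B(n): the square of n is nonnegative."""
    return n * n >= 0

# "for all x, B(x)" versus "not (exists x, not B(x))" on the same window.
equivalent_on_window = all(B(n) for n in window) == (not any(not B(n) for n in window))

print(squares_positive, has_integer_root, equivalent_on_window)
```

A check on a finite window is evidence, not proof; the quantified statements about all integers still require an argument.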

It is common to say that first-order logic consists of the connectives $\wedge$, $\vee$, $\sim$, $\Rightarrow$,
$\Leftrightarrow$, the equality symbol =, and the quantifiers $\forall$ and $\exists$, together with an infinite
string of variables x, y, z, ..., x', y', z', ... and, finally, parentheses ( , ) to keep
things readable. The word “first” here is used to distinguish the discussion from
second-order and higher-order logics. In first-order logic the quantifiers $\forall$ and $\exists$
always range over elements of the domain M of discourse. Second-order logic, by
contrast, allows us to quantify over subsets of M and functions F mapping M × M
into M. Third-order logic treats sets of functions and more abstract constructs. The
distinction among these different orders is often moot.

Exercises


1. Construct truth tables for each of the following sentences: 


a. $(S \wedge T) \vee \sim (S \vee T)$
b. $(S \vee T) \Rightarrow (S \wedge T)$
2. Let 
S = All fish have eyelids. 
T = There is no justice in the world. 
U = I believe everything that I read. 
V = The moon’s a balloon. 
Express each of the following sentences using the letters S, T, U, V and the 
connectives $\vee$, $\wedge$, $\sim$, $\Rightarrow$, $\Leftrightarrow$. Do not use quantifiers.
a. If fish have eyelids then there is at least some justice in the world. 
b. If I believe everything that I read then either the moon’s a balloon or at 
least some fish have no eyelids. 
3. Let 
S = All politicians are honest. 
T = Some men are fools. 
U = I don’t have two brain cells to rub together. 
W = The pie is in the sky. 
Translate each of the following into English sentences: 


a. $(S \wedge \sim T) \Rightarrow \sim U$
b. $W \vee (T \wedge \sim U)$

4. State the converse and the contrapositive of each of the following sentences. 
Be sure to label each. 

a. In order for it to rain it is necessary that there be clouds. 

b. In order for it to rain it is sufficient that there be clouds. 

5. Assume that the universe is the ordinary system R of real numbers. Which 
of the following sentences is true? Which is false? Give reasons for your 


answers. 


a. If $\pi$ is rational then the area of a circle is $E = mc^2$.


b. If 2+ 2 = 4 then 3/5 is a rational number. 




6. For each of the following statements, formulate a logically equivalent one 
using only S, T, $\sim$, and $\vee$. (Of course you may use as many parentheses as
you need.) Use a truth table or other means to explain why the statements 
are logically equivalent. 

a. $S \Rightarrow \sim T$
b. $\sim S \wedge \sim T$

7. For each of the following statements, formulate an English sentence that is 
its negation: 

a. The set S contains at least two integers. 


b. Mares eat oats and does eat oats. 


8. Which of these pairs of statements is logically equivalent? Why? 


(a) $A \vee \sim B$ and $\sim A \Rightarrow B$
(b) $A \wedge \sim B$ and $\sim A \Rightarrow \sim B$

CHAPTER 2





Methods of 
Mathematical Proof 


2.1 What Is a Proof? 


When a chemist asserts that a substance that is subjected to heat will tend to expand, 
he/she verifies the assertion through experiment. It is a consequence of the definition 
of heat that heat will excite the atomic particles in the substance; it is plausible that 
this in turn will necessitate expansion of the substance. However, our knowledge of 
nature is not such that we may turn these theoretical ingredients into a categorical 
proof. Additional complications arise from the fact that the word “expand” requires 
detailed definition. Apply heat to water that is at temperature 40 degrees Fahrenheit
or above, and it expands; with enough heat it becomes a gas that surely fills more
volume than the original water. But apply heat to a diamond and there is no apparent
“expansion,” at least not to the naked eye.

Mathematics is a less ambitious subject. In particular, it is closed. It does not 
reach outside itself for verification of its assertions. When we make an assertion in 
mathematics, we must verify it using the rules that we have laid down. That is, we
verify it by applying our rules of logic to our axioms and our definitions; in other 
words, we construct a proof. 

In modern mathematics we have discovered that there are perfectly sensible 
mathematical statements that in fact cannot be verified in this fashion, nor can 
they be proven false. This is a manifestation of Gödel’s incompleteness theorem:
that any sufficiently complex logical system will contain such unverifiable, indeed 
untestable, statements. Fortunately, in practice, such statements are the exception 
rather than the rule. In this book, and in almost all of university-level mathematics, 
we concentrate on learning about statements whose truth or falsity is accessible by 
way of proof. 

This chapter considers the notion of mathematical proof. We shall concentrate 
on the three principal types of proof: direct proof, proof by contradiction, and proof 
by induction. In practice, a mathematical proof may contain elements of several or 
all of these techniques. You will see all the basic elements here. You should be sure 
to master each of these proof techniques, both so that you can recognize them in 
your reading and so that they become tools that you can use in your own work. 


2.2 Direct Proof 


In this section we shall assume that you are familiar with the positive integers, or 
natural numbers (a detailed treatment of the natural numbers appears in Sec. 5.2). 
This number system {1, 2, 3, . . .} is denoted by the symbol N. For now we will take 
the elementary arithmetic properties of N for granted. We shall formulate various 
statements about natural numbers and we shall prove them. Our methodology will 
emulate the discussions in earlier sections. We begin with a definition. 


Definition 2.1 A natural number n is said to be even if, when it is divided by 2, 
there is an integer quotient and no remainder. 


Definition 2.2 A natural number n is said to be odd if, when it is divided by 2, 
there is an integer quotient and remainder 1. 


You may have never before considered, at this level of precision, what is the 
meaning of the terms “odd” or “even.” But your intuition should confirm these 
definitions. A good definition should be precise, but it should also appeal to your 
heuristic idea about the concept that is being defined. 

Notice that, according to these definitions, any natural number is either even or 
odd. For if n is any natural number, and if we divide it by 2, then the remainder 
will be either 0 or 1; there is no other possibility (according to the division
algorithm). In the first instance, n is even; in the second, n is odd.

In what follows we will find it convenient to think of an even natural number 
as one having the form 2m for some natural number m. We will think of an odd 
natural number as one having the form 2k + 1 for some nonnegative integer k. 
Check for yourself that, in the first instance, division by 2 will result in a quotient 
of m and a remainder of 0; in the second instance it will result in a quotient of k 
and a remainder of 1. 
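
The two forms 2m and 2k + 1 correspond directly to the quotient and remainder on division by 2. As a small sketch (the function name `parity_form` is a hypothetical helper introduced here, not something from the text), Python's built-in `divmod` returns both numbers at once:

```python
def parity_form(n):
    """Return ('even', m) when n == 2*m, or ('odd', k) when n == 2*k + 1."""
    quotient, remainder = divmod(n, 2)  # n == 2*quotient + remainder
    if remainder == 0:
        return ("even", quotient)
    return ("odd", quotient)

for n in (1, 7, 10):
    kind, q = parity_form(n)
    print(n, kind, q)
```

Note that n = 1 is odd with k = 0, which is why the text asks for a nonnegative integer k rather than a natural number.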

Now let us formulate a statement about the natural numbers and prove it. Fol- 
lowing tradition, we refer to formal mathematical statements either as theorems 
or propositions or sometimes as lemmas. A theorem is supposed to be an impor- 
tant statement that is the culmination of some development of significant ideas. A 
proposition is a statement of lesser intrinsic importance. Usually a lemma is of no 
intrinsic interest, but is needed as a step along the way to verifying a theorem or 
proposition. 


Proposition 2.1 The square of an even natural number is even. 


Proof: Let us begin by using what we learned in Chap. 1. We may reformulate
our statement as “If n is even then $n \cdot n$ is even.” This statement makes a promise.
Refer to the definition of “even” to see what that promise is:

If n can be written as twice a natural number then $n \cdot n$ can be written as twice
a natural number.

The hypothesis of the assertion is that $n = 2 \cdot m$ for some natural number m. But
then


$$n^2 = n \cdot n = (2m) \cdot (2m) = 4m^2 = 2(2m^2)$$


Our calculation shows that $n^2$ is twice the natural number $2m^2$. So $n^2$ is also even.

We have shown that the hypothesis that n is twice a natural number entails the
conclusion that $n^2$ is twice a natural number. In other words, if n is even then $n^2$ is
even. That is the end of our proof.














Remark 2.1 What is the role of truth tables at this point? Why did we not use a 
truth table to verify our proposition? One could think of the statement that we are 
proving as the conjunction of infinitely many specific statements about concrete 
instances of the variable n; and then we could verify each one of those statements. 
But such a procedure is inelegant and, more importantly, impractical. 

For our purpose, the truth table tells us what we must do to construct a proof. 
The truth table for A => B shows that if A is false then there is nothing to check 
whereas if A is true then we must show that B is true. That is just what we did in
the proof of Proposition 2.1. 

Most of our theorems are “for all” statements or “there exists” statements. In 
practice, it is not usually possible to verify them directly by use of a truth table. 


Proposition 2.2 The square of an odd natural number is odd. 


Proof: We follow the paradigm laid down in the proof of the previous proposition. 
Assume that n is odd. Then n = 2k + 1 for some nonnegative integer k. But then


$$n^2 = n \cdot n = (2k + 1) \cdot (2k + 1) = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1$$


We see that $n^2$ is $2k' + 1$, where $k' = 2k^2 + 2k$. In other words, according to our
definition, $n^2$ is odd.
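
The proofs of Propositions 2.1 and 2.2 are constructive: each one names the number that exhibits the square as "twice something" or "twice something plus 1." The sketch below computes that number and spot-checks it on the first thousand natural numbers. The helper name `square_witness` is an assumption made for illustration, and a finite check of this kind supplements the proofs rather than replacing them.

```python
def square_witness(n):
    """Return w such that n*n == 2*w when n is even, or n*n == 2*w + 1 when n is odd."""
    half, remainder = divmod(n, 2)
    if remainder == 0:
        # n = 2m gives n^2 = 2(2m^2)
        return 2 * half * half
    # n = 2k + 1 gives n^2 = 2(2k^2 + 2k) + 1
    return 2 * half * half + 2 * half

for n in range(1, 1001):
    w = square_witness(n)
    # The square has the same remainder on division by 2 as n itself.
    assert n * n == 2 * w + (n % 2)
```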














Both of the proofs that we have presented are examples of “direct proof.” A 
direct proof proceeds according to the statement being proved; for instance, if 
we are proving a statement about a square then we calculate that square. If we 
are proving a statement about a sum then we calculate that sum. Here are some 
additional examples: 


EXAMPLE 2.1 
Prove that, if n is a positive integer, then the quantity $n^2 + 3n + 2$ is even.


Solution: Denote the quantity $n^2 + 3n + 2$ by K. Observe that
$$K = n^2 + 3n + 2 = (n + 1)(n + 2)$$


Thus K is the product of two successive integers: n + 1 and n + 2. One of those 
two integers must be even. So it is a multiple of 2. Therefore K itself is a multiple 
of 2. Hence K must be even. 
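
Both steps of this solution, the factorization and the evenness, are easy to spot-check numerically. The following is a minimal sketch under the assumption that testing n from 1 to 1000 is a reasonable sample; the function name `K` simply mirrors the symbol in the solution.

```python
def K(n):
    """The quantity from Example 2.1."""
    return n * n + 3 * n + 2

for n in range(1, 1001):
    assert K(n) == (n + 1) * (n + 2)  # product of two successive integers
    assert K(n) % 2 == 0              # hence a multiple of 2
```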














You Try It: Prove that the cube of an odd number must be odd. 


Proposition 2.3 The sum of two odd natural numbers is even. 


Proof: Suppose that p and q are both odd natural numbers. According to the 
definition, we may write p = 2r + 1 and q = 2s + 1 for some nonnegative integers
r and s. Then


$$p + q = (2r + 1) + (2s + 1) = 2r + 2s + 2 = 2(r + s + 1)$$


We have realized p + q as twice the natural number r + s + 1. Therefore p + q is
even. 

Remark 2.2 If we did mathematics solely according to what sounds good, or what
appeals intuitively, then we might reason as follows: “if the sum of two odd natural 
numbers is even then it must be that the sum of two even natural numbers is odd.” 
This is incorrect. For instance 4 and 6 are each even but their sum 4 + 6 = 10 is 
not odd. 

Intuition definitely plays an important role in the development of mathematics, 
but all assertions in mathematics must, in the end, be proved by rigorous methods. 


EXAMPLE 2.2 
Prove that the sum of an even integer and an odd integer is odd. 


Solution: An even integer is divisible by 2, so may be written in the form e = 2m, 
where m is an integer. An odd integer has remainder 1 when divided by 2, so may 
be written in the form o = 2k + 1, where k is an integer. The sum of these is 


$$e + o = 2m + (2k + 1) = 2(m + k) + 1$$


Thus we see that the sum of an even and an odd integer will have remainder 1 when 
it is divided by 2. As a result, the sum is odd. 














Proposition 2.4 The sum of two even natural numbers is even. 
Proof: Let p = 2r and q = 2s both be even natural numbers. Then 
p+q=2r +2s =2(r +s) 


We have realized p + q as twice a natural number. Therefore we conclude that 
p +q is even. 
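
Propositions 2.3 and 2.4 together with Example 2.2 amount to an addition table for parities. As an illustrative sketch (the names `parity` and `table`, and the sample range 1 to 20, are assumptions made here), the table can be tabulated by brute force and compared with what was proved:

```python
def parity(n):
    """Classify an integer as 'even' or 'odd'."""
    return "even" if n % 2 == 0 else "odd"

# For each pair of input parities, collect every parity seen in the sum.
table = {}
for p in range(1, 21):
    for q in range(1, 21):
        table.setdefault((parity(p), parity(q)), set()).add(parity(p + q))

for inputs, outcomes in sorted(table.items()):
    print(inputs, "->", outcomes)
```

Each entry contains exactly one outcome, in agreement with the proofs, and the entry for two even numbers is "even," which is the point of Remark 2.2.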














Proposition 2.5 Let n be a natural number. Then either n > 6 or n < 9.


Proof: If you draw a picture of a number line then you will have no trouble 
convincing yourself of the truth of the assertion. What we want to learn here is to 
organize our thoughts so that we may write down a rigorous proof. 

Our discussion of the connective “or” in Sec. 1.2 will now come to our aid. Fix 
a natural number n. If n > 6 then the “or” statement is true and there is nothing to 
prove. If $n \not> 6$, then the truth table teaches us that we must check that n < 9. But
the statement $n \not> 6$ means that $n \le 6$ so we have


$$n \le 6 < 9$$











That is what we wished to prove. 

EXAMPLE 2.3
Prove that every even integer may be written as the sum of two odd integers. 


Solution: Let the even integer be K = 2m, for m an integer. If m is odd then we 
write 


K=2m=m+m 


and we have written K as the sum of two odd integers. If, instead, m is even, then 
we write 


$$K = 2m = (m - 1) + (m + 1)$$


Since m is even then both m - 1 and m + 1 are odd. So again we have written K
as the sum of two odd integers. 
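
The case analysis in this solution is effectively an algorithm. A minimal sketch follows; the function name `two_odd_summands` is hypothetical, and the code simply transcribes the two cases (m odd, m even) from the argument above.

```python
def two_odd_summands(K):
    """Split an even integer K into two odd integers, following the two cases."""
    if K % 2 != 0:
        raise ValueError("K must be even")
    m = K // 2
    if m % 2 != 0:
        return m, m          # m odd: K = m + m
    return m - 1, m + 1      # m even: K = (m - 1) + (m + 1)

for K in (-4, 0, 6, 8, 100):
    a, b = two_odd_summands(K)
    assert a + b == K and a % 2 != 0 and b % 2 != 0
    print(K, "=", a, "+", b)
```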














EXAMPLE 2.4 
Prove the Pythagorean theorem. 


Solution: The Pythagorean theorem states that $c^2 = a^2 + b^2$, where a and b are
the legs of a right triangle and c is its hypotenuse. See Fig. 2.1.

Figure 2.1 The Pythagorean theorem.

Figure 2.2 Proof of the Pythagorean theorem.

Consider now the arrangement of four triangles and a square shown in Fig. 2.2.
Each of the four triangles is a copy of the original triangle in Fig. 2.1. We see that
each side of the all-encompassing square is equal to c. So the area of that square
is $c^2$. Now each of the component triangles has base a and height b. So each such
triangle has area ab/2. And the little square in the middle has side b - a. So it has
area $(b - a)^2 = b^2 - 2ab + a^2$. We write the total area as the sum of its component
areas:


$$c^2 = 4 \cdot \frac{ab}{2} + [b^2 - 2ab + a^2] = a^2 + b^2$$


That is the desired equality.














In this section and the next two we are concerned with form rather than 
substance. We are not interested in proving anything profound, but rather in show- 
ing you what a proof looks like. Later in the book we shall consider some deeper 
mathematical ideas and correspondingly more profound proofs. 


2.3 Proof by Contradiction 


Aristotelian logic dictates that every sensible statement has a truth value: TRUE 
or FALSE. If we can demonstrate that a statement A could not possibly be false, 
then it must be true. On the other hand, if we can demonstrate that A could not be 
true, then it must be false. Here is a dramatic example of this principle. In order to 
present it, we shall assume for the moment that you are familiar with the system 
Q of rational numbers. These are numbers that may be written as the quotient of 
two integers (without dividing by zero, of course). We shall discuss the rational 
numbers in greater detail in Sec. 5.4. 

Theorem 2.1 (Pythagoras) There is no rational number x with the property that
$x^2 = 2$.

Proof: In symbols (refer to Chap. 1), our assertion may be written
$$\sim [\exists x,\ (x \in \mathbb{Q} \wedge x^2 = 2)]$$
Let us assume the statement to be false. Then what we are assuming is that
$$\exists x,\ (x \in \mathbb{Q} \wedge x^2 = 2) \qquad (2.1)$$

Since x is rational we may write x = p/q, where p and q are integers.

We may as well suppose that both p and q are positive and nonzero. After
reducing the fraction, we may suppose that it is in lowest terms, so p and q have
no common factors.

Now our hypothesis asserts that
$$x^2 = 2$$
or
$$\left(\frac{p}{q}\right)^2 = 2$$
We may write this out as
$$p^2 = 2q^2 \qquad (2.2)$$

Observe that this equation asserts that $p^2$ is an even number. But then p must be an
even number (p cannot be odd, for that would imply that $p^2$ is odd by Proposition
2.2). So p = 2r for some natural number r.

Substituting this assertion into Eq. (2.2) now yields that
$$(2r)^2 = 2q^2$$
Simplifying, we may rewrite our equation as
$$2r^2 = q^2$$

This new equation asserts that $q^2$ is even. But then q itself must be even.

We have proven that both p and q are even. But that means that they have a
common factor of 2. This contradicts our starting assumption that p and q have no 
common factor. 
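
The equation at the heart of this proof says that the square of p is twice the square of q. A computer search cannot replace the argument, since it can only examine finitely many denominators, but it does show what the theorem predicts. The sketch below assumes a search bound of 10,000 for q, and the function name `rational_sqrt2_candidates` is a hypothetical helper. It uses `math.isqrt`, the exact integer square root available in Python 3.8 and later.

```python
from math import isqrt

def rational_sqrt2_candidates(max_q):
    """Return every pair (p, q) with 1 <= q <= max_q and p*p == 2*q*q."""
    hits = []
    for q in range(1, max_q + 1):
        target = 2 * q * q
        p = isqrt(target)        # the only positive integer that could work
        if p * p == target:
            hits.append((p, q))
    return hits

print(rational_sqrt2_candidates(10_000))
```

The search comes back empty, as the theorem requires for every bound.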














Let us pause to ascertain what we have established: the assumption that a rational 
square root x of 2 exists, and that it has been written in lowest terms as x = p/q, 
leads to the conclusion that p and q have a common factor and hence are not in
lowest terms. What does this entail for our logical system?

We cannot allow a statement of the form $C = A \wedge \sim A$ (in the present context the
statement A is “x = p/q in lowest terms”). For such a statement C must be false. 

But if x exists then the statement C is true. No statement (such as A) can have 
two truth values. In other words, the statement C must be false. The only possible 
conclusion is that x does not exist. That is what we wished to establish. 


Remark 2.3 In practice, we do not include the last three paragraphs in a proof 
by contradiction. We provide them now because this is our first exposure to such 
a proof, and we want to make the reasoning absolutely clear. The point is that 
the assertions A and ~ A cannot both be true. An assumption that leads to this 
eventuality cannot be valid. That is the essence of proof by contradiction. 


Historically, Theorem 2.1 was extremely important. Prior to Pythagoras (c. 570-495
B.C.E.), the ancient Greeks (following Eudoxus) believed that all numbers (at least
all numbers that arise in real life) are rational. However, by the Pythagorean theorem,
the length of the diagonal of a unit square is a number whose square is 2. And our
theorem asserts that such a number cannot be rational. We now know that there are
many nonrational, or irrational, numbers.

Here is a second example of a proof by contradiction: 


Theorem 2.2 (Dirichlet) Suppose that n+ 1 pieces of mail are delivered to n 
mailboxes. Then some mailbox contains at least two pieces of mail. 


Proof: Suppose that the assertion is false. Then each mailbox contains either zero 
or one piece of mail. But then the total amount of mail in all the mailboxes cannot 
exceed 


$$\underbrace{1 + 1 + \cdots + 1}_{n \text{ times}}$$


In other words, there are at most n pieces of mail. That conclusion contradicts the 
fact that there are n + 1 pieces of mail. We conclude that some mailbox contains 
at least two pieces of mail. 

Figure 2.3 Points at random in the unit interval.


The last theorem, due to Gustav Lejeune Dirichlet (1805-1859), was classically
known as the Dirichletscher Schubfachschluss. This German name translates to 
“Dirichlet’s drawer shutting principle.” Today, at least in this country, it is more 
commonly known as “the pigeonhole principle.” Since pigeonholes are no longer 
a common artifact of everyday life, we have illustrated the idea using mailboxes. 


EXAMPLE 2.5 
Draw the unit interval I in the real line. Now pick 11 points at random from
that interval (imagine throwing darts at the interval, or dropping ink drops on the 
interval). See Fig. 2.3. Then some pair of the points has distance not greater than 
0.1 inch. 

To see this, write 


$$I = [0, 0.1] \cup [0.1, 0.2] \cup \cdots \cup [0.8, 0.9] \cup [0.9, 1]$$


Here we have used standard interval notation. Think of each of these subintervals as 
a mailbox. We are delivering 11 letters (that is, the randomly selected points) to these 
10 mailboxes. By the pigeonhole principle, some mailbox must receive two letters. 

We conclude that some subinterval of I, having length 0.1, contains two of the
randomly selected points. Thus their distance does not exceed 0.1 inch. 
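
The conclusion of this example can be tried out by simulation. In the sketch below, the helper `closest_pair_gap`, the fixed random seed, and the number of trials are all assumptions made for illustration. Sorting the points reduces the search for the closest pair to a comparison of neighbors.

```python
import random

def closest_pair_gap(points):
    """Smallest distance between any two of the given points."""
    ordered = sorted(points)
    # After sorting, the closest pair must be adjacent.
    return min(b - a for a, b in zip(ordered, ordered[1:]))

random.seed(0)  # fixed seed so that the run is repeatable
for _ in range(1000):
    darts = [random.random() for _ in range(11)]  # 11 points in the unit interval
    assert closest_pair_gap(darts) <= 0.1
```

No trial ever violates the bound. The pigeonhole principle explains why: none can.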














2.4 Proof by Induction 


The logical validity of the method of proof by induction is intimately bound up 
with the construction of the natural numbers, with ordinal arithmetic, and with the 
so-called well-ordering principle. We shall not treat those logical niceties here, but 
shall instead concentrate on the technique. As with any good idea in mathematics, 
we shall be able to make it intuitively clear that the method is a valid and useful 
one. So no confusion should result. 

Consider a statement P (n) about the natural numbers. For example, the statement 
might be “The quantity $n^2 + 5n + 6$ is always even.” If we wish to prove this
statement, we might proceed as follows:


1. Prove the statement P(1).
2. Prove that $P(k) \Rightarrow P(k + 1)$ for every $k \in \{1, 2, \ldots\}$.

Let us apply the syllogism modus ponendo ponens from the end of Sec. 1.5 to
determine what we will have accomplished. We know P(1) and, from Step (2) with
k = 1, that $P(1) \Rightarrow P(2)$. We may therefore conclude P(2). Now Step (2) with
k = 2 says that $P(2) \Rightarrow P(3)$. We may then conclude P(3). Continuing in this
fashion, we may establish P (n) for every natural number n. 

Notice that this reasoning applies to any statement P (n) for which we can es- 
tablish Steps (1) and (2) above. Thus Steps (1) and (2) taken together constitute a 
method of proof. It is a method of establishing a statement P (n) for every natural 
number n. The method is known as proof by induction. 


EXAMPLE 2.6 
Let us use the method of induction to prove that, for every natural number n, the
number $n^2 + 5n + 6$ is even.


Solution: Our statement P(n) is
The number $n^2 + 5n + 6$ is even


[Note: Explicitly identifying P(n) is more than a formality. Always record carefully
what P(n) is before proceeding.]

We now proceed in two steps:
P(1) is true. When n = 1 then


$$n^2 + 5n + 6 = 1^2 + 5 \cdot 1 + 6 = 12$$


and this is certainly even. We have verified P(1).
$P(n) \Rightarrow P(n + 1)$ is true. We are proving an implication in this step. We assume
P(n) and use it to establish P(n + 1). Thus we are assuming that


$$n^2 + 5n + 6 = 2m$$
for some natural number m. Then, to check P(n + 1), we calculate


$$\begin{aligned}
(n + 1)^2 + 5(n + 1) + 6 &= [n^2 + 2n + 1] + [5n + 5] + 6 \\
&= [n^2 + 5n + 6] + [2n + 6] \\
&= 2m + [2n + 6]
\end{aligned}$$

Notice that in the last step we have used our hypothesis that $n^2 + 5n + 6$ is even,
that is that $n^2 + 5n + 6 = 2m$. Now the last line may be rewritten as


$$2(m + n + 3)$$


Thus we see that $(n + 1)^2 + 5(n + 1) + 6$ is twice the natural number m + n + 3.
In other words, $(n + 1)^2 + 5(n + 1) + 6$ is even. But that is the assertion P(n + 1).

In summary, assuming the assertion P(n) we have established the assertion 
P(n + 1). That completes Step (2) of the method of induction. We conclude that 
P (n) is true for every n. 
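
The two steps of this induction can be mirrored in a short script. The sketch below is a numerical companion, not a proof: it assumes a test range of 1 to 1000, and the function name `f` is simply a label for the quantity in P(n). The key observation from the solution is that passing from n to n + 1 adds the even number 2n + 6.

```python
def f(n):
    """The quantity appearing in P(n)."""
    return n * n + 5 * n + 6

# Step (1): the base case P(1).
assert f(1) == 12 and f(1) % 2 == 0

# Step (2): f(n + 1) - f(n) = 2n + 6 is even, so evenness carries from n to n + 1.
for n in range(1, 1001):
    assert f(n + 1) - f(n) == 2 * n + 6
    assert f(n + 1) % 2 == 0
```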














Here is another example to illustrate the method of induction. This formula is 
often attributed to Carl Friedrich Gauss (and alleged to have been discovered by 
him when he was 11 years old). 


Proposition 2.6 If n is any natural number then


$$1 + 2 + \cdots + n = \frac{n(n + 1)}{2}$$

Proof: The statement P(n) is
$$1 + 2 + \cdots + n = \frac{n(n + 1)}{2}$$


Now let us follow the method of induction closely.
P(1) is true. The statement P(1) is


$$1 = \frac{1(1 + 1)}{2}$$

This is plainly true.


$P(n) \Rightarrow P(n + 1)$ is true. We are proving an implication in this step. We assume
P(n) and use it to establish P(n + 1). Thus we are assuming that


$$1 + 2 + \cdots + n = \frac{n(n + 1)}{2} \qquad (2.3)$$


Let us add the quantity (n + 1) to both sides of Eq. (2.3). We obtain


$$1 + 2 + \cdots + n + (n + 1) = \frac{n(n + 1)}{2} + (n + 1)$$

The left side of this last equation is exactly the left side of P(n + 1) that we are
trying to establish. That is the motivation for our last step.
Now the right-hand side may be rewritten as


$$\frac{n(n + 1) + 2(n + 1)}{2}$$


This simplifies to


$$\frac{(n + 1)(n + 2)}{2}$$


In conclusion, we have established that


$$1 + 2 + \cdots + (n + 1) = \frac{(n + 1)(n + 2)}{2}$$

This is the statement P(n + 1).

Assuming the validity of P (n), we have proved the validity of P (n + 1). That 
completes the second step of the method of induction, and establishes P (n) for 
all n. 
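
The inductive step in this proof adds n + 1 to both sides and simplifies. That same step can be traced in code by keeping a running total and comparing it with the closed form at each stage. This is a sketch under the assumption that n up to 1000 is an adequate sample; the function name `gauss` is chosen here for illustration.

```python
def gauss(n):
    """Closed form n(n + 1)/2; the product n(n + 1) is always even."""
    return n * (n + 1) // 2

running_total = 0
for n in range(1, 1001):
    running_total += n                          # left side: 1 + 2 + ... + n
    assert running_total == gauss(n)            # the statement P(n)
    assert gauss(n) + (n + 1) == gauss(n + 1)   # the inductive step

print(gauss(100))
```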














Some problems are formulated in such a way that it is convenient to begin the 
induction with some value of n other than n = 1. The next example illustrates this 
notion: 


EXAMPLE 2.7 
Let us prove that, for $n \ge 4$, we have the inequality


$$3^n > 2n^2 + 3n$$
Solution: The statement P(n) is
$$3^n > 2n^2 + 3n$$


P(4) is true. Observe that the inequality is false for n = 1, 2, 3. However, for n = 4
it is certainly the case that


$$3^4 = 81 > 44 = 2 \cdot 4^2 + 3 \cdot 4$$

$P(n) \Rightarrow P(n + 1)$ is true. Now assume that P(n) has been established and let us
use it to prove P(n + 1). We are hypothesizing that
$$3^n > 2n^2 + 3n$$

Multiplying both sides by 3 gives

$$3 \cdot 3^n > 3(2n^2 + 3n)$$
or

$$3^{n+1} > 6n^2 + 9n$$

But now we have


$$\begin{aligned}
3^{n+1} &> 6n^2 + 9n \\
&= 2(n^2 + 2n + n) + (4n^2 + 3n) \\
&> 2(n^2 + 2n + 1) + (3n + 3) \\
&= 2(n + 1)^2 + 3(n + 1)
\end{aligned}$$


This inequality is just P (n + 1), as we wished to establish. That completes Step (2) 
of the induction, and therefore completes the proof. 
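
The reason for starting the induction at n = 4 shows up immediately when the inequality is evaluated for small n. The sketch below is only a numerical check over an assumed range (n from 4 to 199), and the function name `holds` is a hypothetical helper. Python integers are exact, so there is no rounding to worry about.

```python
def holds(n):
    """Does 3^n > 2n^2 + 3n hold for this n?"""
    return 3 ** n > 2 * n * n + 3 * n

small_cases = [holds(n) for n in (1, 2, 3)]
from_four_on = all(holds(n) for n in range(4, 200))

print(small_cases, from_four_on)
```

For n = 3 the two sides are both equal to 27, so the strict inequality only just fails there.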














We conclude this section by mentioning an alternative form of the induction 
paradigm which is sometimes called complete mathematical induction or strong 
mathematical induction. 


2.4.1 COMPLETE MATHEMATICAL INDUCTION 


Let P be a function on the natural numbers. If


1. P(1);
2. [P(j) for all $j \le n$] $\Rightarrow$ P(n + 1) for every natural number n;


then P(n) is true for every n.

It turns out that the complete induction principle is logically equivalent to the 
ordinary induction principle enunciated at the outset of this section. But in some 
instances strong induction is the more useful tool. An alternative terminology for 
complete induction is “the set formulation of induction.” 

Complete induction is sometimes more convenient, or more natural, to use than
ordinary induction; it finds particular use in abstract algebra. Complete induction 
also is a simple instance of transfinite induction, which you will encounter in a 
more advanced course. 


EXAMPLE 2.8 
Every integer greater than 1 is either prime or the product of primes. (Here a prime
number is an integer greater than 1 whose only positive factors are 1 and itself.)


Solution: We will use strong induction, just to illustrate the idea. For convenience 
we begin the induction process at the index 2 rather than at 1. 

Let P (n) be the assertion “Either n is prime or n is the product of primes.” Then 
P(2) is plainly true since 2 is the first prime. Now assume that P(j) is true for
$2 \le j \le n$ and consider P(n + 1). If n + 1 is prime then we are done. If n + 1
is not prime then n + 1 factors as $n + 1 = k \cdot \ell$, where k, $\ell$ are integers less than
n + 1, but at least 2. By the strong inductive hypothesis, each of k and $\ell$ factors as
a product of primes (or is itself a prime). Thus n + 1 factors as a product of primes. 

The complete induction is done, and the proof is complete. 
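
Strong induction has a natural counterpart in recursion: to handle n, a function may call itself on any smaller value, not just on n - 1. The sketch below follows the structure of the argument above. The function name `prime_factors` and the trial-division search for a factor are assumptions made for illustration; the text does not prescribe how a factorization is found.

```python
from math import isqrt, prod

def prime_factors(n):
    """Return a list of primes whose product is n, for an integer n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    for k in range(2, isqrt(n) + 1):
        if n % k == 0:
            # n = k * l with both factors at least 2 and less than n.
            # Recursing on both is where the strong hypothesis is used.
            return prime_factors(k) + prime_factors(n // k)
    return [n]  # no proper factor exists, so n itself is prime

for n in (2, 97, 360, 1001):
    factors = prime_factors(n)
    assert prod(factors) == n
    print(n, factors)
```

The recursion terminates because every recursive call is made on a strictly smaller integer that is still at least 2.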














2.5 Other Methods of Proof 


We give here a number of examples that illustrate proof techniques other than direct 
proof, proof by contradiction, and induction. 


2.5.1 COUNTING ARGUMENTS 


EXAMPLE 2.9 
Show that if there are 23 people in a room then the odds are better than even that 
two of them have the same birthday. 


Solution: The best strategy is to calculate the odds that no two people have the 
same birthday, and then to take complements. 

Let us label the people $p_1, p_2, \ldots, p_{23}$. Then, assuming that none of the $p_j$ have
the same birthday, we see that $p_1$ can have his birthday on any of the 365 days in
the year, $p_2$ can then have his birthday on any of the remaining 364 days, $p_3$ can
have his birthday on any of the remaining 363 days, and so forth. So the number of
different ways that these 23 people can all have different birthdays is


$$365 \cdot 364 \cdot 363 \cdots 345 \cdot 344 \cdot 343$$

On the other hand, the number of ways that birthdays could be distributed (with no
restrictions) among 23 people is


$$\underbrace{365 \cdot 365 \cdot 365 \cdots 365}_{23 \text{ times}} = 365^{23}$$

Thus the probability that these 23 people all have different birthdays is


$$p = \frac{365 \cdot 364 \cdot 363 \cdots 343}{365^{23}}$$


A quick calculation with a calculator shows that $p \approx 0.4927 < 0.5$. That is the
desired result. 
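
The "quick calculation" can be reproduced exactly. The sketch below assumes, as the solution does, a 365-day year with all birthdays equally likely. The function name `prob_all_distinct` is a hypothetical helper, and `fractions.Fraction` keeps the arithmetic exact until the final conversion to a decimal.

```python
from fractions import Fraction

def prob_all_distinct(people, days=365):
    """Probability that the given number of people all have different birthdays."""
    p = Fraction(1)
    for i in range(people):
        p *= Fraction(days - i, days)  # person i + 1 avoids the i days already taken
    return p

p23 = float(prob_all_distinct(23))
p22 = float(prob_all_distinct(22))
print(p23, p22)
```

With 22 people the probability of all-different birthdays is still above one half, so 23 is the smallest group for which a shared birthday is more likely than not.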














EXAMPLE 2.10 
Jill is dealt a poker hand of 5 cards from a standard deck of 52. What is the probability
that she holds four of a kind?


Solution: If the hand holds 4 aces, then the fifth card is any one of the other 48 
cards. If the hand holds 4 kings, then the fifth card is any one of the other 48 cards. 
And so forth. So there are a total of 


13 x 48 = 624 


possible hands with four of a kind. The total number of possible 5-card hands is
$$\binom{52}{5} = 2598960$$


Therefore the probability of holding 4 of a kind is


$$p = \frac{624}{2598960} \approx 0.00024$$
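
The counts in this example are quick to confirm. The sketch below uses `math.comb` (available in Python 3.8 and later) for the binomial coefficient; the variable names are chosen here for illustration.

```python
from math import comb

four_of_a_kind = 13 * 48        # 13 ranks, then any of the 48 remaining cards
total_hands = comb(52, 5)       # number of 5-card hands from 52 cards
p = four_of_a_kind / total_hands

print(four_of_a_kind, total_hands, p)
```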














2.5.2 OTHER ARGUMENTS 


EXAMPLE 2.11 
Let us show that there exist irrational numbers a and b such that $a^b$ is rational.

Solution: Let $\alpha = \sqrt{2}$ and $\beta = \sqrt{2}$. If $\alpha^\beta$ is rational then we are done, using
$a = \alpha$ and $b = \beta$. If $\alpha^\beta$ is irrational, then observe that


$$\left(\alpha^\beta\right)^{\sqrt{2}} = \alpha^{\beta \cdot \sqrt{2}} = \alpha^2 = [\sqrt{2}]^2 = 2$$


Thus, with $a = \alpha^\beta$ and $b = \sqrt{2}$ we have found two irrational numbers a, b such
that $a^b = 2$ is rational.














EXAMPLE 2.12 

Show that if there are six people in a room then either three of them know each 
other or three of them do not know each other. (Here three people know each other 
if each of the three pairs has met. Three people do not know each other if each of 
the three pairs has not met.) 


Solution: The tedious way to do this problem is to write out all possible “acquain- 
tance assignments” for 6 people. 

We now describe a more efficient, and more satisfying, strategy. Call one of the 
people Bob. There are five others. Either Bob knows three of them, or he does not 
know three of them. 

Say that Bob knows three of the others. If any two of those three are acquainted, 
then those two and Bob form a mutually acquainted threesome. If no two of those 
three know each other, then those three are a mutually unacquainted threesome. 

Now suppose that Bob does not know three of the others. If any two of those 
three are unacquainted, then those two and Bob form an unacquainted threesome. 
If all pairs among the three are instead acquainted, then those three form a mutually 
acquainted threesome. 

We have covered all possibilities, and in every instance come up either with a 
mutually acquainted threesome or a mutually unacquainted threesome. That ends 
the proof. 














It may be worth knowing that five people is insufficient to guarantee either a 
mutually acquainted threesome or a mutually unacquainted threesome. We leave it 
to the reader to provide a suitable counterexample. It is quite difficult to determine 
the minimal number of people to solve the problem when “threesome” is replaced by 
“foursome.” When “foursome” is replaced by five people, the problem is considered 
to be grossly intractable. This problem is a simple example from the mathematical 
subject known as Ramsey theory. 
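For six people the "tedious way" is well within reach of a computer: there are 15 pairs, hence $2^{15} = 32768$ acquaintance assignments. The sketch below checks every one of them; the function name `always_has_uniform_trio` is our own invention. Running it with 5 in place of 6 reports that the guarantee fails, without revealing the counterexample.

```python
from itertools import combinations, product

def always_has_uniform_trio(n):
    """Return True if every acquaintance assignment on n people contains
    three mutual acquaintances or three mutual strangers."""
    people = range(n)
    pairs = list(combinations(people, 2))
    trios = list(combinations(people, 3))
    # Each assignment marks every pair as acquainted (True) or not (False).
    for assignment in product((True, False), repeat=len(pairs)):
        knows = dict(zip(pairs, assignment))
        found = any(
            knows[(a, b)] == knows[(a, c)] == knows[(b, c)]
            for a, b, c in trios
        )
        if not found:
            return False
    return True

print(always_has_uniform_trio(6), always_has_uniform_trio(5))
```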
Exercises


1. Prove that the product of two odd natural numbers must be odd.

2. Prove that if n is an even natural number and if m is any natural number then
   $n \cdot m$ must be even.

3. Prove that the sum of the squares of the first n natural numbers is equal to

   \[ \frac{2n^3 + 3n^2 + n}{6} \]

4. Prove that if m is a power of 3 and n is a power of 3 then m + n is never a
   power of 3.

5. Prove that if n is a natural number and if n has a rational square root then in
   fact the square root of n is an integer.

6. Prove that if the natural number n is a perfect square then n + 1 will never
   be a perfect square.

7. A popular recreational puzzle hypothesizes that you have nine pearls that
   are identical in appearance. Eight of these pearls have the same weight, but
   the ninth is either heavier or lighter; you do not know which. You have a
   balance scale, and are allowed three weighings to find the odd pearl. How
   do you proceed?

   Now here is a bogus proof by induction that you can solve the problem
   in the first paragraph in three weighings not just for nine pearls but for any
   number of pearls. For convenience let us begin the induction with the case
   n = 9 pearls. By the result of the first paragraph, we can handle that case.
   Now, inductively, suppose that we have an algorithm for handling n pearls.
   We use this hypothesis to treat (n + 1) pearls. From the (n + 1) pearls,
   remove one and put it in your pocket. There remain n pearls. Apply the
   n-pearl algorithm to these remaining pearls. If you find the odd pearl then you
   are done. If you do not find the odd pearl, then it is the one in your pocket.
   That completes the case (n + 1) and the proof.

   What is the flaw in this reasoning? [Note: If you are fiendishly clever, then
   you can actually handle 12 pearls in the original problem with just three
   weighings. However, this requires the consideration of 27 cases.]

8. Prove that if k is a natural number that is greater than 2 then $2^k > 1 + 2k$.

9. Prove the pigeonhole principle by induction.

10. You write 27 letters to 27 different people. Then you address the 27 envelopes.
    You close your eyes and stuff one letter into each envelope. What is the
    probability that just one letter is in the wrong envelope?


CHAPTER 3 





Set Theory 


3.1 Rudiments 


Even the most elementary considerations in logic may lead to conundrums. Suppose 
that we wish to define the notion of “line.” We might say that it is the shortest path 
between two points. This is not completely satisfactory because we have not yet 
defined “path” or “point.” And when we say “the shortest path” do we mean that 
there is just one unique shortest path? And why does it exist? Every new definition 
is, perforce, formulated in terms of other ideas. And those ideas in terms of other 
ones. Where does the regression cease? 

The accepted method for dealing with this problem is to begin with certain terms 
(as few as possible) that are agreed to be “undefinable.” These terms should be 
so simple that there can be little argument as to their meaning. But it is agreed in 
advance that these undefinable terms simply cannot be defined in terms of ideas 
that have been previously defined. Our undefined terms are our starting place. 

In modern mathematics it is customary to use “set” and “element of” as unde- 
finables. A set is declared to be a collection of objects. (Please do not ask what 
an “object” is or what a “collection” is; when we say that the term “set” is an 
undefinable then we mean just that.) If S is a set then we say that x is an element of
S, and we write $x \in S$ or $S \ni x$, precisely when x is one of the objects that compose
the set S. For example, we write $5 \in \mathbb{N}$ to indicate that the number 5 is an element
of the set of natural numbers. We write $-7 \notin \mathbb{N}$ to specify that $-7$ is not an element
of the set of natural numbers.


Definition 3.1 We say that two sets S and T are equal precisely when they have the
same elements. We write S = T. Alternatively, we say that S = T provided that
both $x \in S \Rightarrow x \in T$ and $x \in T \Rightarrow x \in S$.


As an example of equality of sets, if $S = \{x \in \mathbb{N} : x^2 > 3\}$ and $T = \{x \in \mathbb{N} : x \geq 2\}$
then S = T.

Incidentally, the method of specifying a set with the notation {x : P (x)}, where 
P denotes a property, is the most common method in mathematics of defining a 
set. This is sometimes called “setbuilder notation.” 
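Setbuilder notation has a close analogue in programming: a set comprehension. As a rough illustration, here are the sets $\{x \in \mathbb{N} : x^2 > 3\}$ and $\{x \in \mathbb{N} : x \geq 2\}$ from the equality example, with the natural numbers cut off at 100 so that the sets are finite. The cutoff is an assumption made only for this sketch.

```python
# A finite stand-in for the natural numbers (assumption: 1..100 is enough here).
naturals = range(1, 101)

S = {x for x in naturals if x**2 > 3}   # {x in N : x^2 > 3}
T = {x for x in naturals if x >= 2}     # {x in N : x >= 2}

print(S == T)
```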

We shall endeavor, in what follows, to formulate all of our set-theoretic notions 
in a rigorous and logical fashion from the undefinables “set” and “element of.” If at 
any point we were to arrive at an untenable position, or a logical contradiction, or a 
fallacy,¹ then we know that the fault lies either with our method of reasoning or with
our undefinables or with our axioms. One of the advantages of the way that we do
mathematics is that if there is ever trouble then we know where the trouble must lie. 


3.2 Elements of Set Theory 


Beginning in this section, we will be doing mathematics in the way that it is usually 
done. That is, we shall define terms and we shall state and prove properties that 
they satisfy. In earlier chapters we were careful, but we were less mathematical. 
Sometimes we even had to say “This is the way we do it; don’t worry.” Many of 
the topics in Chaps. 1 and 2 are really only best understood from the advanced 
perspectives of mathematical logic. Now, and for the rest of this book, it is time to 
show how mathematics is done in practice. 

We use theorems, propositions, and lemmas to formulate our ideas. The device 
of proofs is used to validate those ideas. Another formal ingredient of mathematical 
exposition is the “definition.” A definition usually introduces a new piece of termi- 
nology or a new idea and explains what it means in terms of ideas and terminology 
that have already been presented. As you read this chapter, pause frequently to 
check that we are following this paradigm. 





¹The reader should not worry. This is well-trodden ground, and there are no pitfalls lurking ahead.
Definition 3.2 Let S and T be sets. We say that S is a subset of T, and we write
$S \subset T$ or $T \supset S$, if

\[ x \in S \Rightarrow x \in T \]

We do not prove our definitions. There is nothing to prove. A definition introduces
you to a new idea, or piece of terminology, or piece of notation.


EXAMPLE 3.1 
Let $S = \{x \in \mathbb{N} : x > 3\}$ and $T = \{x \in \mathbb{N} : x^2 > 4\}$. Determine whether $S \subset T$ or
$T \subset S$.


Solution: The key to success and clarity in handling subset questions is to use the
definition. To see whether $S \subset T$ we must check whether $x \in S$ implies $x \in T$. Now
if $x \in S$ then $x > 3$ hence $x^2 > 9$ so certainly $x^2 > 4$. Our syllogism is proved,
and we conclude that $S \subset T$.

The reverse inclusion is false. For example, the number 3 is an element of T but
is certainly not an element of S. We write $T \not\subset S$.














EXAMPLE 3.2 
Let $\mathbb{Z}$ denote the set of integers. Let $S = \{-2, 3\}$. Let $T = \{x \in \mathbb{Z} : x^3 - x^2 - 6x = 0\}$.
Determine whether $S \subset T$ or $T \subset S$.


Solution: To see whether $S \subset T$ we must check whether $x \in S$ implies $x \in T$.

Let $x \in S$. Then either $x = -2$ or $x = 3$. If $x = -2$ then
$x^3 - x^2 - 6x = (-2)^3 - (-2)^2 - 6(-2) = 0$. Also, if $x = 3$ then
$x^3 - x^2 - 6x = (3)^3 - (3)^2 - 6(3) = 0$.

This verifies the syllogism “$x \in S$ implies $x \in T$.” Therefore $S \subset T$.
The reverse inclusion fails, for $0 \in T$ but $0 \notin S$.
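The set T in this example can be listed explicitly, which makes both conclusions easy to see. The sketch below searches a finite window of integers for roots of $x^3 - x^2 - 6x$. The window from $-10$ to $10$ is our choice; it suffices because the cubic factors as $x(x - 3)(x + 2)$.

```python
S = {-2, 3}

# Integer roots of x^3 - x^2 - 6x = x(x - 3)(x + 2), found by direct search.
T = {x for x in range(-10, 11) if x**3 - x**2 - 6*x == 0}

print(sorted(T))    # the elements of T
print(S <= T)       # is S a subset of T?
print(T <= S)       # is T a subset of S?
```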














You Try It: How are S = {1, 2, 3, 5, 6, 8} and T = {1, 3, 6} related? 


EXAMPLE 3.3 
Let $S = \{x \in \mathbb{N} : x > 4\}$ and $T = \{x \in \mathbb{N} : x < 9\}$. Is it true that either $S \subset T$ or
$T \subset S$?


Solution: Both inclusions are false. For $10 \in S$ but $10 \notin T$ and $2 \in T$ but
$2 \notin S$.














Proposition 3.1 Let S and T be sets. Then S = T if and only if both $S \subset T$ and
$T \subset S$.

Proof: If S = T then, by definition, S and T have precisely the same elements.
In particular, this means that $x \in S$ implies $x \in T$ and also $x \in T$ implies $x \in S$.
That is, $S \subset T$ and $T \subset S$.

Now suppose that both $S \subset T$ and $T \subset S$. Seeking a contradiction, suppose that
$S \neq T$. Then either there is some element of S that is not an element of T or there
is some element of T that is not an element of S. The first eventuality contradicts
$S \subset T$ and the second eventuality contradicts $T \subset S$. We conclude that S = T.














Definition 3.3 We let $\emptyset$ denote the set that contains no elements. That is, $\forall x,\ x \notin \emptyset$.
We call $\emptyset$ the empty set.


EXAMPLE 3.4
If S is any set then $\emptyset \subset S$. To see this, notice that the statement “if $x \in \emptyset$ then
$x \in S$” must be true because the hypothesis $x \in \emptyset$ is false. (Check the truth table
for “if-then” statements.) This verifies that $\emptyset \subset S$.














EXAMPLE 3.5 
Let $S = \{x \in \mathbb{N} : x + 2 > 19 \text{ and } x < 3\}$. Then S is a sensible set. There are no
internal contradictions in its definition. But $S = \emptyset$. There are no elements in S.














Definition 3.4 Let S and T be sets. We say that x is an element of $S \cap T$ if both
$x \in S$ and $x \in T$. We say that x is an element of $S \cup T$ if either $x \in S$ or $x \in T$.

We call $S \cap T$ the intersection of the sets S and T. We call $S \cup T$ the union of
the sets S and T.


EXAMPLE 3.6 
Let $S = \{x \in \mathbb{N} : 2 < x < 9\}$ and $T = \{x \in \mathbb{N} : 5 < x < 14\}$. Then
$S \cap T = \{x \in \mathbb{N} : 5 < x < 9\}$, for these are the points common to both sets. And
$S \cup T = \{x \in \mathbb{N} : 2 < x < 14\}$, for these are the points that are either in S or in T or in both.
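Intersections and unions of finite sets can be computed directly, which offers a way to check an example like this one. The sketch below assumes the strict inequalities $2 < x < 9$ and $5 < x < 14$ and again replaces $\mathbb{N}$ by the finite range 1 to 100.

```python
# A finite stand-in for the natural numbers (an assumption for illustration).
naturals = range(1, 101)

S = {x for x in naturals if 2 < x < 9}
T = {x for x in naturals if 5 < x < 14}

intersection = S & T   # elements in both S and T
union = S | T          # elements in S or in T or in both

print(sorted(intersection), sorted(union))
```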














Remark 3.1 Observe that the use of “or” in the definition of set union justifies our 


decision to use the “inclusive ‘or’ ” rather than the “exclusive ‘or’ ” in mathematics. 
See also Proposition 3.2 below. 


EXAMPLE 3.7 
Let $S = \{x \in \mathbb{N} : 1 < x < 5\}$ and $T = \{x \in \mathbb{N} : 8 < x < 12\}$. Then $S \cap T = \emptyset$,
for the sets S and T have no elements in common. On the other hand,
$S \cup T = \{x \in \mathbb{N} : 1 < x < 5 \text{ or } 8 < x < 12\}$.














Definition 3.5 Let S and T be sets. We say that $x \in S \setminus T$ if both $x \in S$ and $x \notin T$.
We call $S \setminus T$ the set-theoretic difference of S and T.

EXAMPLE 3.8
Let $S = \{x \in \mathbb{N} : 2 < x < 7\}$ and $T = \{x \in \mathbb{N} : 5 \leq x < 10\}$. Then we see that
$S \setminus T = \{x \in \mathbb{N} : 2 < x < 5\}$ and $T \setminus S = \{x \in \mathbb{N} : 7 \leq x < 10\}$.














Definition 3.6 Suppose that we are studying subsets of a fixed set X. If $S \subset X$ then
we use the symbol ${}^cS$ to denote $X \setminus S$. In this context we sometimes refer to X as
the universal set. We call ${}^cS$ the complement of S (in X).


EXAMPLE 3.9 
Let $\mathbb{N}$ be the universal set. Let $S = \{x \in \mathbb{N} : 3 < x \leq 20\}$. Then

\[ {}^cS = \{x \in \mathbb{N} : 1 \leq x \leq 3\} \cup \{x \in \mathbb{N} : 20 < x\} \]














The next proposition puts our use of the “inclusive or” into context.

Proposition 3.2 Let X be the universal set and $S \subset X$, $T \subset X$. Then


(1) ${}^c(S \cup T) = {}^cS \cap {}^cT$
(2) ${}^c(S \cap T) = {}^cS \cup {}^cT$


Proof: We shall present this proof in detail since it is a good exercise in
understanding both our definitions and our method of proof.

We begin with the proof of Part (1). It is often best to treat the proof of the equality
of two sets as two separate proofs of containment. (This is why Proposition 3.1 is
important.) That is what we now do.

Let $x \in {}^c(S \cup T)$. Then, by definition, $x \notin (S \cup T)$. Thus x is neither an element
of S nor an element of T. So both $x \in {}^cS$ and $x \in {}^cT$. Hence $x \in {}^cS \cap {}^cT$. We
conclude that ${}^c(S \cup T) \subset {}^cS \cap {}^cT$. Conversely, if $x \in {}^cS \cap {}^cT$ then $x \notin S$ and
$x \notin T$. Therefore $x \notin (S \cup T)$. As a result, $x \in {}^c(S \cup T)$. Thus ${}^cS \cap {}^cT \subset {}^c(S \cup T)$.
Summarizing, we have ${}^c(S \cup T) = {}^cS \cap {}^cT$.

The proof of Part (2) is similar, but we include it for practice. Let $x \in {}^c(S \cap T)$.
Then, by definition, $x \notin (S \cap T)$. Thus x is not both an element of S and an
element of T. So either $x \in {}^cS$ or $x \in {}^cT$. Hence $x \in {}^cS \cup {}^cT$. We conclude that
${}^c(S \cap T) \subset {}^cS \cup {}^cT$. Conversely, if $x \in {}^cS \cup {}^cT$ then either $x \notin S$ or $x \notin T$.
Therefore $x \notin (S \cap T)$. As a result, $x \in {}^c(S \cap T)$. Thus ${}^cS \cup {}^cT \subset {}^c(S \cap T)$.
Summarizing, we have ${}^c(S \cap T) = {}^cS \cup {}^cT$.
Figure 3.1 Venn diagram showing $A \cap B$.


The two formulas in the last proposition are often referred to as de Morgan’s 
laws. Compare them with de Morgan’s laws for $\lor$ and $\land$ at the end of Sec. 1.3.
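Both of de Morgan's laws for sets can be spot-checked by machine on a small universal set. The following sketch takes X = {0, 1, 2, 3, 4} (an arbitrary choice of ours) and tests both identities for every pair of subsets. Such a check is no substitute for the proof, but it is a useful way to catch a misremembered formula.

```python
from itertools import combinations

def subsets(universe):
    """Yield every subset of a finite collection as a frozenset."""
    items = list(universe)
    for size in range(len(items) + 1):
        for chosen in combinations(items, size):
            yield frozenset(chosen)

X = frozenset(range(5))   # a small universal set, chosen arbitrarily

def complement(A):
    """Complement of A relative to the universal set X."""
    return X - A

de_morgan_holds = all(
    complement(S | T) == complement(S) & complement(T)
    and complement(S & T) == complement(S) | complement(T)
    for S in subsets(X)
    for T in subsets(X)
)
print(de_morgan_holds)
```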


3.3 Venn Diagrams 


We sometimes use a Venn diagram to aid our understanding of set-theoretic
relationships. In a Venn diagram, a set is represented as a region in the plane (for
convenience, we use rectangles). The intersection $A \cap B$ of two sets A and B is
the region common to the two domains (we have shaded that region with dots in
Fig. 3.1).

The reader may wish to think about how to represent a union as a Venn diagram.

Now let A, B, and C be three sets. The Venn diagram in Fig. 3.2 makes it easy
to see that $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.

The Venn diagram in Fig. 3.3 illustrates the fact that

\[ A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C) \]

Figure 3.2 Venn diagram showing $A \cap (B \cup C)$.

Figure 3.3 Venn diagram showing $A \setminus (B \cup C)$.


A Venn diagram is not a proper substitute for a rigorous mathematical proof. 
However, it can go a long way toward guiding our intuition. 
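Like a Venn diagram, a computer check on a small example guides intuition without constituting a proof. As a sketch, the two identities $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$ and $A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C)$ can be tested on every triple of subsets of a four-element universe; the universe and the variable names are our own choices.

```python
from itertools import product

universe = range(4)

# Every subset of the universe, built from membership bit patterns.
all_subsets = [
    frozenset(x for x, keep in zip(universe, bits) if keep)
    for bits in product((False, True), repeat=len(universe))
]

distributive = all(
    A & (B | C) == (A & B) | (A & C)
    for A in all_subsets for B in all_subsets for C in all_subsets
)
difference_law = all(
    A - (B | C) == (A - B) & (A - C)
    for A in all_subsets for B in all_subsets for C in all_subsets
)
print(distributive, difference_law)
```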


3.4 Further Ideas in Elementary Set Theory 


Now we learn some new ways to combine sets. 
Definition 3.7 Let S and T be sets. We define $S \times T$ to be the set of all ordered
pairs (s, t) such that $s \in S$ and $t \in T$. The set $S \times T$ is called the set-theoretic
product of S and T.


EXAMPLE 3.10
Let S = {1, 2, 3} and T = {a, b}. Then

\[ S \times T = \{(1, a), (1, b), (2, a), (2, b), (3, a), (3, b)\} \]














It is no coincidence that, in the last example, the set S has 3 elements, the set T has
2 elements, and the set $S \times T$ has $3 \times 2 = 6$ elements. In fact one can prove that
if S has k elements and T has $\ell$ elements then $S \times T$ has $k \cdot \ell$ elements. Exercise
3.12 asks you to prove this assertion by induction on k.

Notice that $S \times T$ is a different set from $T \times S$. With S and T as in the last
example,

\[ T \times S = \{(a, 1), (b, 1), (a, 2), (b, 2), (a, 3), (b, 3)\} \]

The phrase “ordered pair” means that the pair (a, 1), for example, is distinct from
the pair (1, a).
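The set-theoretic product is available in Python's standard library as `itertools.product`. The sketch below rebuilds $S \times T$ and $T \times S$ for S = {1, 2, 3} and T = {a, b}, representing a and b by the strings "a" and "b" (a choice made only for illustration).

```python
from itertools import product

S = [1, 2, 3]
T = ["a", "b"]

S_times_T = set(product(S, T))   # all ordered pairs (s, t)
T_times_S = set(product(T, S))   # all ordered pairs (t, s)

print(len(S_times_T), len(S) * len(T))
print(S_times_T == T_times_S)    # ordered pairs: the two products differ
```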

If S is a set then the power set of S is the set of all subsets of S. We denote the
power set by $\mathcal{P}(S)$.


EXAMPLE 3.11
Let S = {1, 2, 3}. Then

\[ \mathcal{P}(S) = \{\{1\}, \{2\}, \{3\}, \{1, 2\}, \{2, 3\}, \{1, 3\}, \{1, 2, 3\}, \emptyset\} \]














If the concept of power set is new to you, then you might have been surprised 
to see {1, 2, 3} and Ø as elements of the power set. But they are both subsets of S, 
and they must be listed. 


Proposition 3.3 Let $S = \{s_1, \ldots, s_k\}$ be a set. Then $\mathcal{P}(S)$ has $2^k$ elements.


Proof: We prove the assertion by induction on k.
The inductive statement P(k) is


A set with k elements has power set with $2^k$ elements


P(1) is true. In this case, $S = \{s_1\}$ and $\mathcal{P}(S) = \{\{s_1\}, \emptyset\}$. Notice that S has k = 1
element and $\mathcal{P}(S)$ has $2^1 = 2$ elements.


P(k) $\Rightarrow$ P(k + 1) is true. Assume that any set of the form $S = \{s_1, \ldots, s_k\}$ has
power set with $2^k$ elements. Now let $T = \{t_1, \ldots, t_k, t_{k+1}\}$. Consider the subset
$T' = \{t_1, \ldots, t_k\}$ of T. By the inductive hypothesis, the power set of $T'$ has $2^k$
elements.

Now $\mathcal{P}(T)$ certainly contains $\mathcal{P}(T')$ (that is, every subset of $T'$ is also a subset
of T). But it also contains each of the sets that is obtained by adjoining the element
$t_{k+1}$ to each subset of $T'$. There are $2^k$ instances of each type of set. Thus the total
number of subsets of T is

\[ 2^k + 2^k = 2^{k+1} \]

Notice that we have indeed counted all subsets of T, since any subset either
contains $t_{k+1}$ or it does not.

Thus, assuming the validity of our assertion for k, we have proved its validity 
for k + 1. That completes our induction and the proof of the proposition. 
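The inductive step translates almost word for word into a recursive procedure: take all subsets of the first k elements, then add a second copy of each with the last element adjoined. The sketch below follows that idea; the function name `power_set` is ours, and for convenience the recursion bottoms out at the empty set (which has $2^0 = 1$ subset) rather than at k = 1.

```python
def power_set(elements):
    """Build the power set the way the induction does: every subset of the
    first k elements, plus each of those subsets with the last element added."""
    if not elements:
        return [frozenset()]          # the empty set has exactly one subset
    *rest, last = elements
    smaller = power_set(rest)         # the 2^k subsets that omit `last`
    return smaller + [subset | {last} for subset in smaller]

for k in range(6):
    print(k, len(power_set(list(range(k)))))
```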














We have seen that the operation of set-theoretic product corresponds to the 
arithmetic product of natural numbers. And now we have seen that the operation 
of taking the power set corresponds to exponentiation. 
Exercises


1. Let S = {1, 2, 3, 4, 5}, T = {3, 4, 5, 7, 8, 9}, U = {1, 2, 3, 4, 9}, V = {2, 4,
   6, 8}. Calculate each of the following:

   a. $(S \cup V) \cap U$

   b. $(S \cap T) \cup U$

2. Let S be any set and let $T = \emptyset$. What can you say about $S \times T$?

3. Prove the following formulas for arbitrary sets S, T, and U. [Hint: You may
   find Venn diagrams useful to guide your thinking, but a Venn diagram is not
   a proof.]

   a. $S \cap (T \cup U) = (S \cap T) \cup (S \cap U)$

   b. $S \cup (T \cap U) = (S \cup T) \cap (S \cup U)$

4. Draw Venn diagrams to illustrate parts (a) and (b) of Exercise 3.3.

5. Suppose that $A \subset B \subset C$. What is $A \setminus B$? What is $A \setminus C$? What is $A \cup B$?

6. Describe the set $\mathbb{Q} \setminus \mathbb{Z}$ in words. Describe $\mathbb{R} \setminus \mathbb{Q}$.

7. Describe $\mathbb{Q} \times \mathbb{R}$ in words. Describe $\mathbb{Q} \times \mathbb{Z}$.

8. Describe $(\mathbb{Q} \times \mathbb{R}) \setminus (\mathbb{Z} \times \mathbb{Q})$ in words.

9. Give an explicit description of the power set of S = {a, b, 1, 2}.

10. Let S = {a, b, c, d}, T = {1, 2, 3}, and U = {b, 2}. Which of the following
    statements is true?

    a. $\{a\} \in S$

    b. $1 \in T$

    c. $\{b, 2\} \in U$

11. Write out the power set of each set:

    a. $\{1, \emptyset, \{a, b\}\}$

    b. {e, A, a}

12. Prove using induction on k that if the set S has k elements and the set T has
    $\ell$ elements then the set $S \times T$ has $k \cdot \ell$ elements.
CHAPTER 4





Functions and Relations 


4.1 A Word About Number Systems 


For the record, we make a note here of some of the number systems that will be used 
throughout this book. A more detailed and rigorous treatment of these mathematical 
constructs may be found in Chap. 5. 

The most basic and rudimentary number system is the natural numbers. These 
are the counting numbers 1, 2, 3, . . .. There are infinitely many natural numbers. Of 
course the natural numbers are the number system that is most familiar to everyone. 
Everyone must count things—his/her possessions, or change from a purchase, or 
sports scores. Natural numbers are the key to all these processes. We denote the set 
of natural numbers with the symbol N. 

The next level of sophistication is represented by the integers. These are the 
whole numbers which are positive or negative or 0. Symbolically, the integers are 
given by 


\[ \ldots, -3, -2, -1, 0, 1, 2, 3, \ldots \]
From a mathematical point of view, the integers are more attractive than the
natural numbers because they are closed under certain arithmetic operations, notably
subtraction. The expression 3 − 7 makes good sense in the integers; it does not in
the natural numbers. We denote the set of integers by Z (because “Z” is the first 
letter of the German word Zahlen, meaning numbers). 

While the integers are closed under addition, subtraction, and multiplication, they 
are not closed under division. As an example, 5/7 makes no sense in the integers. 
For this reason we create the number system known as the rational numbers. These 
are all fractions p/q, where p and q are integers and q is not equal to zero (because of 
course we are never allowed to divide by 0). The rational numbers form an attractive 
number system because they are closed under all four arithmetic operations. We 
denote the rational numbers by Q (standing for “quotient”).

The most subtle and sophisticated number system, from our point of view, is 
the real number system. The real numbers consist of all decimal expansions, both 
terminating and nonterminating. All the rational numbers are also real numbers 
(and a rational number has a decimal expansion that is either terminating or re- 
peating). But there are also decimal expansions that are both nonterminating and 
nonrepeating. These represent the irrational numbers—which are real numbers 
that are not rational. Most of modern science and engineering is done with the 
real number system. The real numbers are not only closed under the four basic 
arithmetic operations, but they are also closed under various limiting processes 
that are important for mathematical analysis. We denote the real number system 
by R. 

In closing, we shall briefly mention the complex number system. These are num- 
bers of the form x + iy where x and y are both real (and i denotes the square root of 
—1). The complex numbers have an addition operation and a multiplication/division 
operation—and the number system is closed under both of these. The complex num- 
bers were invented to be a number system in which every polynomial equation has a 
root. But complex numbers have proved to be important in physics and engineering 
and partial differential equations. They are fundamental to modern mathematics 
and science. However, we shall see little of the complex numbers in the present 
book. The complex number system is denoted by C. 

It is worth noting that we have presented the number systems in order of sophisti- 
cation. Each new number system was created because of some lack in the preceding 
number system. For instance, the integers were created because the natural numbers 
were not closed under subtraction. The rational numbers were created because the 
integers are not closed under division. And so forth. 

Chapter 5 provides some detailed discussion of the number systems. In the 
remainder of the book we shall frequently use various number systems to illustrate 
important ideas from set theory or discrete mathematics. 
4.2 Relations and Functions


Let S and T be sets. A relation on S and T is a subset of $S \times T$. If $R$ is a relation
then we write either $(s, t) \in R$ or sometimes $s\,R\,t$ to indicate that (s, t) is an
element of the relation. We will also write $s \sim t$ when the relation being discussed
is understood.


EXAMPLE 4.1 
Let $S = \mathbb{N}$, the natural numbers (or positive, whole numbers); and let $T = \mathbb{R}$, the
real numbers. Define a relation $R$ by $(s, t) \in R$ if $s < \sqrt{t} < s + 1$. For instance,
$(2, 5) \in R$ because $\sqrt{5}$ lies between 2 and 3. Also $(4, 17) \in R$ because $\sqrt{17}$ lies
between 4 and 5. However, (5, 10) does not lie in $R$ because $\sqrt{10}$ is not between
5 and 5 + 1 = 6.
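Since a relation is just a set of ordered pairs, membership in it can be expressed as a yes/no test. The sketch below encodes the condition $s < \sqrt{t} < s + 1$ as a function named `related` (our name); it assumes $t \geq 0$ so that the square root is real.

```python
import math

def related(s, t):
    """True when (s, t) lies in the relation s < sqrt(t) < s + 1 (needs t >= 0)."""
    return s < math.sqrt(t) < s + 1

print(related(2, 5), related(4, 17), related(5, 10))
```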














The domain of a relation R is the set of $s \in S$ such that there exists a $t \in T$
with $(s, t) \in R$. The image of the relation is the set of $t \in T$ such that there exists
an $s \in S$ with $(s, t) \in R$. It is sometimes convenient to refer to the entire set T as
the range of the relation R. Some sources use the word “codomain” rather than
“range.” Clearly the range of a relation contains its image.


EXAMPLE 4.2 
Let $S = \mathbb{N}$ and $T = \mathbb{N}$. Define a relation R on S and T by the condition $(s, t) \in R$
if $s^2 < t$. Observe that, for any element $s \in \mathbb{N} = S$, the number $t = s^2 + 1$ satisfies
$s^2 < t$. Therefore every $s \in S = \mathbb{N}$ is in the domain of the relation.

Now let us think about the image. The number $1 \in \mathbb{N} = T$ cannot be in the
image since there is no element $s \in S = \mathbb{N}$ such that $s^2 < 1$. However, any element
$t \in T$ that exceeds 1 satisfies $1^2 < t$. So $(1, t) \in R$. Thus the image of R is the set
$\{t \in \mathbb{N} : t \geq 2\}$.
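Domain and image can be computed explicitly once the relation is cut down to finite sets. In the sketch below S is truncated to 1 through 20 and T to 1 through 1000; these bounds are our assumption, chosen so that $s^2 + 1$ still lies in T for every s in S.

```python
S = range(1, 21)
T = range(1, 1001)   # large enough that s**2 + 1 is in T for every s in S

# The relation as an explicit set of ordered pairs.
R = {(s, t) for s in S for t in T if s**2 < t}

domain = {s for s, _ in R}   # first coordinates that occur in R
image = {t for _, t in R}    # second coordinates that occur in R

print(domain == set(S), min(image))
```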














EXAMPLE 4.3 
Let $S = \mathbb{N}$ and $T = \mathbb{N}$. Define a relation R on S and T by the condition $(s, t) \in R$
if $s^2 + t^2$ is itself a perfect square. Then, for instance, $(3, 4) \in R$, $(4, 3) \in R$,
$(12, 5) \in R$, and $(5, 12) \in R$. The number 1 is not in the domain of R since there
is no natural number t such that $1^2 + t^2$ is a perfect square (if there were, this would
mean that there are two positive perfect squares that differ by 1, and that is not the case).
The number 2 is not in the domain of R for a similar reason. Likewise, 1 and 2 are
not in the image of R.

In fact both the domain and image of R have infinitely many elements. This 
assertion will be explored in the exercises. 
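A short search gives a feel for which numbers belong to the domain of this relation. The sketch below uses `math.isqrt` to test for perfect squares and looks for a partner t below 500 for each s from 1 to 30. Both bounds are arbitrary choices of ours, and a failed finite search does not by itself prove that no partner exists.

```python
import math

def is_perfect_square(n):
    """True when the positive integer n is the square of an integer."""
    root = math.isqrt(n)
    return root * root == n

def related(s, t):
    """True when s^2 + t^2 is a perfect square."""
    return is_perfect_square(s * s + t * t)

# Those s in 1..30 that have some partner t in 1..499.
domain_sample = {
    s for s in range(1, 31)
    if any(related(s, t) for t in range(1, 500))
}
print(sorted(domain_sample))
```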
You Try It: Let S be the real numbers and T be the integers. Let us say that $s \in S$
is related to $t \in T$ if $s < t^3$. What are the domain, range, and image of this relation?


Many interesting relations arise for which S and T are the same set. Say that 
S =T =A. Then a relation on S and T is called simply a relation on A. 


EXAMPLE 4.4 
Let $\mathbb{Z}$ be the integers. Let us define a relation R on $\mathbb{Z}$ by the condition $(s, t) \in R$ if
s − t is divisible by 2. It is easy to see that both the domain and the image of this
relation are $\mathbb{Z}$ itself. It is also worth noting that if n is any integer then the set of all
elements related to n is either (a) the set of all even integers (if n is even) or (b) the
set of all odd integers (if n is odd).














Notice that the last relation created a division of the domain (=image) into two 
disjoint sets: the even integers and the odd integers. This was a special instance of 
an important type of relation that we now define. 


Definition 4.1 Let R be a relation on a set A. We say that R is an equivalence 
relation if the following properties hold: 


R is reflexive: If $x \in A$ then $(x, x) \in R$;
R is symmetric: If $(x, y) \in R$ then $(y, x) \in R$;
R is transitive: If $(x, y) \in R$ and $(y, z) \in R$ then $(x, z) \in R$.


Check for yourself that the relation described in Example 4.4 is in fact an 
equivalence relation. The most important property of equivalence relations is that 
which we indicated just before the definition and which we now enunciate formally: 


Proposition 4.1 Let R be an equivalence relation on a set A. If $x \in A$ then define

\[ E_x = \{y \in A : (x, y) \in R\} \]

We call the sets $E_x$ the equivalence classes induced by the relation R. If now s
and t are any two elements of A then either $E_s \cap E_t = \emptyset$ or $E_s = E_t$.

In summary, the set A is the pairwise disjoint union of the equivalence classes 
induced by the equivalence relation R. 


Before we illustrate this proposition, let us discuss for a moment what it means.
Clearly every element $a \in A$ is contained in some equivalence class, for a is
contained in $E_a$ itself. The proposition tells us that the set A is in fact the pairwise
disjoint union of these equivalence classes. We say that the equivalence classes
partition the set A.

For instance, in Example 4.4, the equivalence relation gives rise to two
equivalence classes: the even integers $\mathcal{E}$ and the odd integers $\mathcal{O}$. Of course $\mathbb{Z} = \mathcal{E} \cup \mathcal{O}$
and $\mathcal{E} \cap \mathcal{O} = \emptyset$. We say that the equivalence relation partitions the universal set $\mathbb{Z}$
into two equivalence classes.

Notice that, in this example, if we pick any element $n \in \mathcal{E}$ then $E_n = \mathcal{E}$. Likewise,
if we pick any element $m \in \mathcal{O}$, then $E_m = \mathcal{O}$.

The reason that an equivalence relation partitions its domain into pairwise disjoint
equivalence classes is quite simple. Suppose that $E_s$ and $E_t$ are two equivalence
classes with nonempty intersection, and that x lies in each of these sets.
Then every element of $E_s$ is related to x, and every element of $E_t$ is related to x. By
symmetry and transitivity, it then follows that every element of $E_s$ is related to every
element of $E_t$ (and vice versa!). So in fact $E_s$ and $E_t$ are one and the same equivalence class.
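The partition into equivalence classes can be watched in action on a finite set. The sketch below forms $E_x = \{y : (x, y) \in R\}$ for each x and collects the distinct classes. It uses the parity relation of Example 4.4 restricted to the integers from −6 to 6; the range, the helper name `equivalence_classes`, and the name `same_parity` are our own.

```python
def equivalence_classes(elements, related):
    """Collect the distinct classes E_x = {y : related(x, y)}.
    Assumes `related` really is an equivalence relation on `elements`."""
    classes = []
    for x in elements:
        E_x = frozenset(y for y in elements if related(x, y))
        if E_x not in classes:
            classes.append(E_x)
    return classes

integers = range(-6, 7)

def same_parity(s, t):
    """The relation of Example 4.4: s - t is divisible by 2."""
    return (s - t) % 2 == 0

classes = equivalence_classes(integers, same_parity)
for E in classes:
    print(sorted(E))
```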


EXAMPLE 4.5 

Let A be the set of all people in the United States. If $x, y \in A$ then let us say that
$(x, y) \in R$ if x and y have the same surname (that is, last name). Then R is an
equivalence relation:


1. R is reflexive since any person x has the same surname as himself or herself.

2. R is symmetric since if x has the same surname as y then y has the same
surname as x.

3. R is transitive since if x has the same surname as y and y has the same
surname as z then x has the same surname as z.


Thus R is an equivalence relation. The equivalence classes are all those people with
surname Smith, all those people with surname Herkimer, and so forth.














EXAMPLE 4.6 
Let S be the set of all residents of the United States. If $x, y \in S$ then let us say
that x is related to y (that is, x ~ y) if x and y have at least one biological parent 
in common. It is easy to see that this relation is reflexive and symmetric. It is not 
transitive, as children of divorced parents know too well. 

To be more explicit, John and Sally could have Bill as a child. Then John and 
Sally could get divorced, John could marry Bettie, and John and Bettie could have 
a child Bobby. Thus Bill and Bobby are “related” (under the relation described in 
the last paragraph). But John and Bettie could get divorced and then Bettie could 
marry Joe and have a child Rufus with Joe. So Bobby and Rufus are related. 

We see that Bill and Bobby are related. And Bobby and Rufus are related. But 
certainly Bill and Rufus are not related, because they have no parent in common. 
So the relation is not transitive. 

What this tells us (mathematically) is that the proliferation of divorce in our 
society does not lead to well-defined families. 
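The failure of transitivity in this example can be checked mechanically. The sketch below records the biological parents of Bill, Bobby, and Rufus exactly as described above and tests the relation "has at least one parent in common"; the dictionary layout and the function name `related` are our own choices.

```python
# Biological parents of each child, as described in the example.
parents = {
    "Bill": {"John", "Sally"},
    "Bobby": {"John", "Bettie"},
    "Rufus": {"Bettie", "Joe"},
}

def related(x, y):
    """True when x and y have at least one biological parent in common."""
    return bool(parents[x] & parents[y])

print(related("Bill", "Bobby"), related("Bobby", "Rufus"), related("Bill", "Rufus"))
```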
4.3 Functions


In more elementary mathematics courses we define a function as follows: Let S
and T be sets. A function f from S to T is a rule that assigns to each element of S 
a unique element of T. 

This definition is problematic. The main difficulty is the use of the words “rule” 
and “assign.” For instance, let S = T = Z. Consider 


\[ f(x) = \begin{cases} x^2 & \text{if there is life as we know it on Mars} \\ 3x - 5 & \text{if there is not life as we know it on Mars} \end{cases} \]


Is this a function? Can what we see on the right be considered a rule? Do we have 
to wait until we have found life on Mars (or not!) before we can consider this a 
function? 

More significantly in practice, thinking of a function as a rule is extremely 
limiting. The functions 


\[ f(x) = x^2 - 3x + 1 \]
\[ g(x) = \sin x \]
\[ h(x) = \frac{\ln x}{\ldots} \]





are inarguably given by rules. But open up your newspaper and look on the financial 
page at the graph of the Gross National Product. This is certainly the graph of a 
function, but what “rule” describes it? 

Consider now the function 


\[ h(x) = \begin{cases} x^2 & \text{if } -\infty < x < -3 \\ 2x & \text{if } -3 < x < 2 \\ x + 5 & \text{if } 2 < x < \infty \end{cases} \]


For many years, until well into the twentieth century, mathematicians disagreed on 
whether this is a bona fide function. But it can certainly happen in practice that 
a physical process or a feedback mechanism can give data that is specified in a 
piecewise fashion. We need a more rigorous, and simultaneously a more flexible, 
means of understanding the concept of function. 

It is best in advanced mathematics to have a way to think about functions that 
avoids subjective words like “rule” and “assign.” This is the motivation for our next 
definition. 
Definition 4.2 Let S and T be sets. A function f from S to T is a relation on S and
T such that


(1) Every $s \in S$ is in the domain of f;
(2) If $(s, t) \in f$ and $(s, u) \in f$ then t = u.


Of course we refer to S and T as the domain and the range, respectively, of f.¹
Condition (1) mandates that each element s of S is associated to some element of T.
Condition (2) mandates, in a formal manner, that each element s of S is associated
to only one member of T. Notice, however, that the definition neatly sidesteps the
notions of “assign” or “rule.”

We shall frequently speak of the image of a given function f from S to T. This
just means the set that is the image of f when it is thought of as a relation. It is just
the set of elements $t \in T$ such that there is an $s \in S$ with $(s, t) \in f$.


EXAMPLE 4.7 
Let S = {1, 2, 3} and T = {a, b, c}. Set

\[ f = \{(1, a), (2, a), (3, b)\} \]

This is a function, for it satisfies the properties set down in Definition 4.2. Given
the way that you are accustomed to writing functions in earlier courses, you might
find it helpful to view this function as

\[ f(1) = a \]
\[ f(2) = a \]
\[ f(3) = b \]


Notice that each element 1, 2,3 of the domain is “assigned” to one and only 
one element of the range. However, the definition of function allows the possibility 
that two different elements of the domain be assigned to the same range element. 
Observe that, for this function, the image is {a, b} while the range is {a, b, c}. 
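
For finite sets, the two conditions of Definition 4.2 can be checked mechanically. The Python sketch below is only an illustration: the helper names is_function and image are ours, not the text's, and a relation is assumed to be stored as a set of (s, t) tuples.

```python
def is_function(f, S, T):
    """Check Definition 4.2 for a finite relation f on S and T."""
    f = set(f)
    # f must be a relation on S and T: every pair lies in S x T.
    if any(s not in S or t not in T for (s, t) in f):
        return False
    first_coords = {s for (s, _) in f}
    # Condition (1): every s in S occurs as a first coordinate.
    if first_coords != set(S):
        return False
    # Condition (2): no s is paired with two different elements of T.
    return len(first_coords) == len(f)


def image(f):
    """The set of second coordinates that actually occur in f."""
    return {t for (_, t) in f}


S = {1, 2, 3}
T = {"a", "b", "c"}
f = {(1, "a"), (2, "a"), (3, "b")}                 # Example 4.7
not_f = {(1, "a"), (1, "b"), (2, "a"), (3, "b")}   # 1 is paired twice

print(is_function(f, S, T))       # True
print(image(f) == {"a", "b"})     # True: the image is smaller than the range T
print(is_function(not_f, S, T))   # False: condition (2) fails
```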














EXAMPLE 4.8 
Let S = {1, 2, 3, 4, 5} and T = {a, b, c, d, e}. Set


f = {(1, b), (2, a), (3, c), (4, a), (5, b)}


This is a function, for it satisfies the properties set down in Definition 4.2. Notice that
each element of the domain S is used once and only once. However, not all elements
of the range are used. According to the definition of function, this is allowed. Note
that the range of f is {a, b, c, d, e} while the image of f is {a, b, c}.

¹ We note, once again, that some sources use the word “codomain” instead of “range.”














Definition 4.3 Let f be a function with domain S and range T. We often write
such a function as f : S → T. We say that f is one-to-one or injective if whenever
(s, t) ∈ f and (s′, t) ∈ f then s = s′. We sometimes refer to such a mapping as an
injection. We also refer to such a map as univalent.


Compare this new definition with Definition 4.2 of function. The new condition is 
similar to condition (2) for functions. But it is not the same. We are now mandating 
that no two domain elements be associated with the same range element. 


EXAMPLE 4.9 
Let S = T = R and let f be the set of all ordered pairs {(x, x²) : x ∈ R}. We may
also write this function as


f : R → R
x ↦ x²

or as f(x) = x².
It is easy to verify that f satisfies the definition of function. However, both of
the ordered pairs (−2, 4) and (2, 4) are in f [in other words f(−2) = 4 = f(2)]
so that f is not one-to-one.














EXAMPLE 4.10 
Let S = T = R and let f be the function f(x) = x³. Then f is strictly increasing
as x moves from left to right. In other words, if s < t then f(s) < f(t). Hence
f(s) ≠ f(t). It follows that the function f is one-to-one.














Definition 4.4 Let f be a function with domain S and range T. If for each t ∈ T
there is an s ∈ S such that f(s) = t then we say that f is onto or surjective. We
sometimes refer to such a mapping as a surjection. Notice that a function is onto
precisely when its image equals its range.


EXAMPLE 4.11
Let f(x) = x² be the function from Example 4.9. Recall from that example that
S = T = R. The point t = −1 ∈ T has the property that there is no s ∈ S such that
f(s) = t. As a result, this function f is not onto.


EXAMPLE 4.12
Let S = R, T = {x ∈ R : 1 ≤ x < ∞}. Let g : S → T be given by g(x) = x² + 1.
Then for each t ∈ T the number s = +√(t − 1) makes sense and lies in S. Moreover,
g(s) = t. It follows that this function g is surjective. However, g is not injective
[since, for example, g(−1) = g(1) = 2].
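
When the domain and range are finite, Definitions 4.3 and 4.4 can be tested directly on the set of ordered pairs. The following Python sketch is illustrative only; the names is_injective and is_surjective, and the sample function h, are our own assumptions rather than anything from the text.

```python
def is_injective(f):
    """Definition 4.3: no two pairs of f share a second coordinate."""
    seconds = [t for (_, t) in f]
    return len(seconds) == len(set(seconds))


def is_surjective(f, T):
    """Definition 4.4: the image of f equals the range T."""
    return {t for (_, t) in f} == set(T)


T = {"a", "b", "c"}
f = {(1, "a"), (2, "a"), (3, "b")}   # the function of Example 4.7
h = {(1, "b"), (2, "c"), (3, "a")}   # a hypothetical one-to-one, onto function

print(is_injective(f), is_surjective(f, T))   # False False
print(is_injective(h), is_surjective(h, T))   # True True
```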














4.4 Combining Functions 


There are several elementary operations that allow us to combine functions in 
useful ways. In this section, and from now on, we shall (whenever possible) write 
our functions in the form 


f(x) = (formula) 


for the sake of clarity. However we must keep in mind, and we shall frequently see, 
that many functions cannot be expressed with an elegant formula. 


Definition 4.5 Let f and g be functions with the same domain S and the same 
range T. Assume that T is a set in which the indicated arithmetic operation (below) 
makes sense. Then we define 


(1) (f + g)(x) = f(x) + g(x);

(2) (f − g)(x) = f(x) − g(x);

(3) (f · g)(x) = f(x) · g(x);

(4) (f/g)(x) = f(x)/g(x) provided that g(x) ≠ 0.


Notice that in each of (1)-(4) we are defining a new function (either f + g or
f − g or f · g or f/g) in terms of the component functions f and g. For practice,
we shall now express (1) in the language of ordered pairs. We ask you to do likewise
with (2), (3), (4) in Exercise 4.7.

Let us consider part (1) in detail. Now f is a collection of ordered pairs in S × T
that satisfy the conditions for a function, and so is g. The function f + g is given
by


f + g = {(s, t + t′) : (s, t) ∈ f, (s, t′) ∈ g}


Expressing the other combinations of f and g is quite similar, and you should
be sure to do the corresponding exercise.


EXAMPLE 4.13
Let S = T = R. Define


f(x) = x³ − x and g(x) = sin(x²)


Let us calculate f + g, f − g, f · g, f/g.
Now
(f + g)(x) = (x³ − x) + sin(x²)
(f − g)(x) = (x³ − x) − sin(x²)
(f · g)(x) = (x³ − x) · [sin(x²)]
(f/g)(x) = (x³ − x)/sin(x²) provided x ≠ ±√(kπ), k ∈ {0, 1, 2, ...}
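
The combinations of Definition 4.5 can be sketched in Python in two ways: pointwise, using ordinary Python functions, and in the language of ordered pairs for a finite domain. This is only an illustration. The helper names add, quotient, and add_pairs are ours, and f and g are taken to be x³ − x and sin(x²) as in Example 4.13.

```python
import math


def f(x):
    return x**3 - x


def g(x):
    return math.sin(x**2)


def add(f, g):
    """Part (1): the function x -> f(x) + g(x)."""
    return lambda x: f(x) + g(x)


def quotient(f, g):
    """Part (4): the function x -> f(x)/g(x), undefined where g(x) = 0."""
    def q(x):
        if g(x) == 0:
            raise ValueError("f/g is undefined where g(x) = 0")
        return f(x) / g(x)
    return q


def add_pairs(f_pairs, g_pairs):
    """f + g as a set of ordered pairs, for functions on the same finite domain."""
    g_lookup = dict(g_pairs)
    return {(s, t + g_lookup[s]) for (s, t) in f_pairs}


print(add(f, g)(2.0))                                     # 6 + sin(4)
print(add_pairs({(1, 2), (2, 5)}, {(1, 10), (2, 20)}))    # the pairs (1, 12) and (2, 25)
```

Calling quotient(f, g)(0.0) raises an error, which mirrors the proviso g(x) ≠ 0 in part (4).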














A more interesting, and more powerful, way to combine functions is through 
functional composition. Incidentally, in this discussion we will see the value of 
good mathematical notation. 


Definition 4.6 Let f : S → T be a function and let g : T → U be a function. Then
we define, for s ∈ S, the composite function


(g ∘ f)(s) = g(f(s))          (4.1)


We call g ∘ f the composition of the functions g and f.


Notice in this definition that the right-hand side of Eq. (4.1) always makes sense
because of the way that we have specified the domain and range of the component
functions f and g. In particular, we must have image f ⊆ domain g in order for
the composition to make sense, simply because we are applying g to f(s).


EXAMPLE 4.14 

Let f : R → {x ∈ R : x ≥ 0} be given by f(x) = x⁴ + x² + 6 and g : {x ∈ R :
x ≥ 0} → R be given by g(x) = √x − 4. Notice that f and g fit the paradigm
specified in the definition of composition of functions. Then


(g ∘ f)(x) = g(f(x))
           = g(x⁴ + x² + 6)
           = √(x⁴ + x² + 6) − 4

Notice that f ∘ g also makes sense and is given by


(f ∘ g)(x) = f(g(x))
           = f(√x − 4)
           = [√x − 4]⁴ + [√x − 4]² + 6


It is important to understand that f ∘ g and g ∘ f, when both make sense, will
generally be different.
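
A short numerical experiment makes the difference between the two compositions concrete. The Python sketch below is an illustration only: the helper compose is our own name, and f and g are taken to be x⁴ + x² + 6 and √x − 4, which is how we read Example 4.14.

```python
import math


def compose(g, f):
    """Return the composite g o f, that is, the function s -> g(f(s))."""
    return lambda s: g(f(s))


def f(x):
    return x**4 + x**2 + 6


def g(x):
    return math.sqrt(x) - 4   # requires x >= 0


g_after_f = compose(g, f)
f_after_g = compose(f, g)

print(g_after_f(1.0))    # sqrt(8) - 4, a negative number
print(f_after_g(1.0))    # f(-3) = 96.0
print(f_after_g(16.0))   # f(0) = 6.0
```

At the same input x = 1 the two composites give different values, so g ∘ f and f ∘ g are different functions here.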














It is a good exercise in the ideas of this chapter to express the notion of functional
composition in the language of ordered pairs. Thus let f : S → T be a function
and g : T → U be a function. Then f is a subset of S × T and g is a subset of
T × U, both satisfying the two standard conditions for functions. Now g ∘ f is a
set of ordered pairs specified by


g ∘ f =
{(s, u) : s ∈ S, u ∈ U, and ∃ t ∈ T such that (s, t) ∈ f and (t, u) ∈ g}


Take a moment to verify that this equation is consistent with the definition of
functional composition that we gave earlier.
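
The ordered-pair description of composition translates almost word for word into a set comprehension. The Python sketch below is illustrative; the name compose_pairs and the sample sets f and g are our own assumptions.

```python
def compose_pairs(g, f):
    """All pairs (s, u) for which some t has (s, t) in f and (t, u) in g."""
    return {(s, u) for (s, t) in f for (t2, u) in g if t == t2}


f = {(1, "a"), (2, "a"), (3, "b")}          # a function from {1, 2, 3} to {a, b, c}
g = {("a", "x"), ("b", "y"), ("c", "z")}    # a function from {a, b, c} to {x, y, z}

print(compose_pairs(g, f) == {(1, "x"), (2, "x"), (3, "y")})   # True
```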


EXAMPLE 4.15 

Let f : R → [−1, 1] be given by f(x) = sin x³ and let g : {x ∈ R : x > 1} → R
be given by g(x) = x − 1. We cannot consider g ∘ f because the range of f (namely,
the set [−1, 1]) does not lie in the domain of g. However, f ∘ g does make sense
because the range of g lies in the domain of f. And


(f ∘ g)(x) = sin([x − 1]³)   for x > 1














Definition 4.7 Let S and T be sets. Let f : S → T and g : T → S. We say that
f and g are mutually inverse provided that both (f ∘ g)(t) = t for all t ∈ T and
(g ∘ f)(s) = s for all s ∈ S. We write g = f⁻¹ or f = g⁻¹. We refer to the func-
tions f and g as invertible, we call g the inverse of f and f the inverse of g.


EXAMPLE 4.16
Let f : R → R be given by f(s) = s³ − 1 and g : R → R be given by g(t) =
∛(t + 1). Then


(f ∘ g)(t) = [∛(t + 1)]³ − 1
           = (t + 1) − 1
           = t

for all t and

(g ∘ f)(s) = ∛((s³ − 1) + 1)
           = ∛(s³)
           = s

for all s. Thus g = f⁻¹ (or f = g⁻¹).











EXAMPLE 4.17 

Let g : [1, +∞) → [8, +∞) be given by g(x) = x² + 3x + 4. Then g(1) = 8, and
g is strictly increasing without bound. So g is one-to-one and onto. We conclude
that g has an inverse. One may use the quadratic formula to solve for that inverse,
and find that g⁻¹(x) = (−3 + √(−7 + 4x))/2.
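
The inverse formula comes from solving y = x² + 3x + 4 for x: the quadratic formula gives x = (−3 ± √(9 − 4(4 − y)))/2 = (−3 ± √(4y − 7))/2, and the plus sign is the root lying in [1, +∞). A quick numerical check, sketched in Python (the function names g and g_inv are ours), confirms that the two maps undo each other on sample points:

```python
import math


def g(x):
    return x**2 + 3 * x + 4           # defined for x >= 1


def g_inv(y):
    return (-3 + math.sqrt(-7 + 4 * y)) / 2   # defined for y >= 8


for x in [1.0, 2.0, 5.5, 100.0]:
    assert math.isclose(g_inv(g(x)), x)

for y in [8.0, 10.0, 50.0]:
    assert math.isclose(g(g_inv(y)), y)

print(g(1.0), g_inv(8.0))   # 8.0 1.0
```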














The idea of inverse function lends itself particularly well to the notation of ordered
pairs. For f : S → T is inverse to g : T → S (and vice versa) provided that for
every ordered pair (s, t) ∈ f there is an ordered pair (t, s) ∈ g and conversely.

Not every function has an inverse. For instance, let f : S → T. Suppose that
f(s) = t and also that f(s′) = t with s ≠ s′ (in other words, suppose that f is
not one-to-one). If g : T → S then g(f(s)) = g(t) = g(f(s′)) so it cannot be that
both g(f(s)) = s and g(f(s′)) = s′. In other words, f cannot have an inverse. We
conclude that a function that does have an inverse must be one-to-one.

On the other hand, suppose that t ∈ T has the property that there is no s ∈ S with
f(s) = t (in other words, suppose that f is not onto). Then, in particular, it could
not be that f(g(t)) = t for any function g : T → S. So f could not be invertible.
We conclude that a function that does have an inverse must be onto.


EXAMPLE 4.18 

Let f : R → {x ∈ R : x ≥ 0} be given by f(x) = x². Then f is onto, but f is not
one-to-one. It follows that f cannot have an inverse. And indeed it does not, for any
attempt to produce an inverse function runs into the ambiguity that every positive
number has two square roots.

Let f : {x ∈ R : x ≥ 0} → R be given by f(x) = x³. Then f is one-to-one but
f is not onto. There certainly is a function g : R → {x ∈ R : x ≥ 0} such that
(g ∘ f)(x) = x for all x ∈ {x ∈ R : x ≥ 0} [namely g(x) = ∛|x|]. But there is no
function h : R → {x ∈ R : x ≥ 0} such that (f ∘ h)(x) = x for all x.














We have established that if f : S → T has an inverse then f must be one-to-
one and onto. The converse is true too, and we leave the details for you to verify.
A function f : S → T that is one-to-one and onto (and therefore invertible) is
sometimes called a set-theoretic isomorphism or a bijection. It is also common to
use the terminology one-to-one correspondence.


EXAMPLE 4.19 
The function f : R → R that is given by f(x) = x³ is a bijection. You should
check the details of this assertion for yourself. The inverse of this function f is the
function g : R → R given by g(x) = x^(1/3).














We leave it as an exercise for you to verify that the composition of two bijections 
(when the composition makes sense) is a bijection. 


4.5 Types of Functions 


The most elementary and easiest understood type of function is the polynomial 
function. A polynomial has the form 


p(x) = a₀ + a₁x + a₂x² + a₃x³ + ... + a_k x^k


We call a₀ the constant coefficient, a₁ the linear coefficient, a₂ the quadratic coef-
ficient, a_j the jth-degree coefficient, and a_k the top-order coefficient.

A polynomial is easy to understand because we can calculate its value at any 
point x by simply plugging in x and then multiplying and adding. A computer can 
calculate values of a polynomial very rapidly and easily. 
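
One common way for a computer to evaluate a polynomial using only multiplication and addition is Horner's rule, which rewrites p(x) as a₀ + x(a₁ + x(a₂ + ...)). The text does not name this method; the Python sketch below, with our own function name horner, is offered only as an illustration of how cheap the computation is.

```python
def horner(coeffs, x):
    """Evaluate a0 + a1*x + ... + ak*x**k, with coeffs = [a0, a1, ..., ak]."""
    result = 0
    # Work from the top-order coefficient down to the constant coefficient.
    for a in reversed(coeffs):
        result = result * x + a
    return result


print(horner([1, 2, 3], 2))   # 1 + 2*2 + 3*4 = 17
```

A polynomial of degree k is evaluated with k multiplications and k additions.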

Another important type of function is the exponential function. For example,
g(x) = 2^x is an exponential function. This function is easy to calculate when x
is an integer; for example, g(4) = 2⁴ = 2 · 2 · 2 · 2 = 16. It is more difficult to
calculate when x is a noninteger; some approximation procedure would probably
be needed (this is how your calculator effects the computation).

You probably know that, when you put money in the bank, and it earns interest, 
then it grows according to a rule given by an exponential function. For example, 
suppose that you put $1000 in the bank and it earns interest annually at the rate of 
5%. Then we have

Year                     Amount

After the first year     $1000 · 1.05
After the second year    $1000 · (1.05)²
After the third year     $1000 · (1.05)³

After the kth year       $1000 · (1.05)^k
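
The table follows the single rule amount = 1000 · (1.05)^k. A minimal Python sketch (the function name balance and its parameter names are our own) reproduces the rows and extends them to any year:

```python
def balance(principal, rate, years):
    """Amount after the given number of years of annual compounding."""
    return principal * (1 + rate) ** years


for k in (1, 2, 3, 10):
    print(k, balance(1000, 0.05, k))
```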


The most important fact about an exponential function is that it will grow faster 
than a polynomial function. We can illustrate this fact simply by doing some cal- 
culations with a calculator: 


Variable x    f(x) = x²    g(x) = 2^x

1             1            2
2             4            4
3             9            8
4             16           16
10            100          1024
50            2500         1.126 × 10^15
100           10000        1.268 × 10^30


It can be proved in general that if p is any polynomial and f is any expo- 
nential function with positive exponent then, for x large enough (and positive), 
f(x) > p(x). In fact we can say more: For x large enough, f (x) > 100- p(x). Or, 
for x even larger, f (x) > 1000 - p(x). As we see from the last table, the growth of 
exponential functions is dramatic. 
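
The phrase “for x large enough” can be explored numerically for the pair in the table, p(x) = x² and f(x) = 2^x. The Python sketch below is not a proof: it only inspects integers below an assumed cutoff (limit), and the function name threshold is our own.

```python
def threshold(c, limit=200):
    """Smallest n with 2**x > c * x**2 for every integer x in [n, limit)."""
    last_failure = 0
    for x in range(1, limit):
        if 2**x <= c * x**2:
            last_failure = x
    return last_failure + 1


print(threshold(1))      # 5
print(threshold(100))    # 15
print(threshold(1000))   # 19
```

Multiplying the polynomial by 100 or by 1000 pushes the crossover point out only a little, from 5 to 15 to 19.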

The flip side of this last information is that logarithmic functions—a third im- 
portant type of function—grow very slowly (in fact slower than any polynomial). A 
logarithmic function is simply the inverse of an exponential function. For example, 
h(x) = log₂ x is the inverse of g(x) = 2^x. We illustrate the idea with another table:


Variable x    f(x) = x²    g(x) = log₂ x

1             1            0
2             4            1
4             16           2
16            256          4
32            1024         5
128           16384        7

It can be proved in general that if p is any polynomial with positive coefficients and
g is any logarithmic function then, for x large enough (and positive), g(x) < p(x).
In fact we can say more: For x large enough, g(x) < (1/100) · p(x). Or, for x
even larger, g(x) < (1/1000) · p(x). As we see from the last table, the growth of
logarithmic functions is dramatically slow.

Many physical phenomena—such as entropy—are best described using loga- 
rithmic functions. The intensity of an earthquake is measured using a logarithmic 
(the Richter) scale. 


Exercises 


1. Consider the relation on Z defined by (m,n) € R if m +n is even. Prove 
that this is an equivalence relation. What are the equivalence classes? 

2. Consider the relation on Z × (Z \ {0}) defined by (m, n) R (m′, n′) provided
that m · n′ = m′ · n. Prove that this is an equivalence relation. Can you de-
scribe the equivalence classes?

3. Consider the relation defined on the cartesian plane by (x, y) R (x′, y′) if
y = y′. Prove that this is an equivalence relation. Can you describe the
equivalence classes? Can you pick a representative (that is, an element) 
of each equivalence class that will help to exhibit what the equivalence 
relation is? 

4. Let S be the set of all living people. Let x, y € S. Say that x is related to y 
if x and y are siblings or the same person, that is, x and y have both parents 
the same. Prove that this is an equivalence relation. What are the equivalence 
classes? 

5. Let S = {a,b,c,d} and T = {1,2,3,4,5,6, 7}. Which of the following 
relations on S x T is a function? Why? 


a. {(a, 4), (d, 3), (c, 3), (b, 2)} 
b. {(a, 5), (c, 4), (d, 3)} 
6. Which of the following functions is one-to-one? Which is onto?
a. f : N → N, f(m) = m + 2
b. g : Z → Z, g(m) = 2m² − 7


7. Express parts (2), (3), (4) of Definition 4.5 using the language of ordered
pairs. Imitate our discussion of part (1) in the text.

8. Find the domain and image of each of these relations:
a. {(x, y) ∈ R × R : x = √(y + 3)}
b. {(α, β) : α is a person, β is a person, and α is the father of β}
9. Use your calculator (or computer) to help you determine at what value 


of x the function f(x) =3**5*+7 passes the function g(x) = 1000 
x! + 100x° + 10x + 100000 in size. 


10. Discuss why there are infinitely many triples of natural numbers k, l, m such
that k² + l² = m².


CHAPTER 5 





Number Systems 


5.1 Preliminary Remarks 


The purpose of this chapter is to illustrate the general principle of modeling. An 
axiomatic theory has no meaning, indeed no essential validity, unless there exists 
a model consisting of a collection of (mathematical) objects that actually satisfy 
the properties specified in the axioms. And constructing a model of an axiomatic 
system is insurance that the system will never lead to a contradiction (subject to the 
usual cautions about relative consistency). 

Thus our purpose is to actually construct the natural numbers, the integers, the 
rational numbers, the real numbers, the complex numbers, and so on. Again, the 
point is that there is no logical validity in saying “it seems to me that there ought 
to be some negative numbers floating around somewhere” or “it seems to me that 
the number —1 should have a square root.” One must construct number systems in 
which this is so. 

One misleading feature of the presentation in this chapter, or in any book con- 
structing the number systems from first principles, is that the more sophisticated 
number systems appear to be easier to construct than the simpler ones. This is
partly because of experience: after we’ve constructed four number systems then the
fifth one follows familiar patterns. But more to the point is that any construction 
of the natural numbers, for instance, must confront fundamental issues of logic 
(such as well-ordering and induction). As a result, some nasty issues will come 
up in Sec. 5.2. The later number systems are founded on the earlier ones, so their 
presentation will be more fluid and more natural. 

These nasty issues must be considered a part of the firmament. There is no simple 
way to deal with the basic problems connected with the natural number system. 
Every student of mathematics should be exposed to these issues at least once; then 
he/she can get on with the rest of the chapter, and with issues that are more directly 
related to the everyday use of mathematics. 


5.2 The Natural Number System 


Giuseppe Peano’s axioms for the natural numbers are as follows. In this discussion, 
we will follow tradition and use the notation ’ to denote the “successor” of a natural 
number. For instance, the successor of 2 is 2’. Intuitively, the successor of n is the 
number n + 1. However, addition is something that comes later; so we formulate 
the basic properties of the natural numbers in terms of the successor function. 


5.2.1 PEANO’S AXIOMS FOR THE NATURAL NUMBERS 


P1 1 ∈ N.
P2 If n ∈ N then n′ ∈ N.
P3 There is no natural number n such that n′ = 1.
P4 If m and n are natural numbers and if m′ = n′ then m = n.
P5 Let P be a property. If
   1. P(1) is true;
   2. P(k) ⇒ P(k′) for every k ∈ N
   then P(n) is true for every n ∈ N.


As Suppes says in [SUP, pp. 121 ff.], these axioms for the natural numbers are 
almost universally accepted (although E. Nelson [NEL], among others, has found it 
useful to explore how to develop arithmetic without Axiom P5). They have evolved 
into their present form so that the natural numbers will satisfy those properties 
that are generally recognized as desirable. Let us briefly mention what each of the 
axioms signifies: 


P1 asserts that N contains a distinguished element that we denote by 1. 


P2 asserts that each element of N has a successor.


P3 asserts that 1 is not the successor of any natural number; in other words,
1 is in a sense the “first” element of N.


P4 asserts that if two natural numbers have the same successor then they are 
in fact the same natural number. 


P5 asserts that the method of induction is valid. 


Some obvious, and heuristically appealing, properties of the natural numbers 
can be derived rather directly from Peano’s axioms. Here is an example: 


Proposition 5.1 Let n be a natural number other than 1. Then n = m′ for some
natural number m.


Remark 5.1 This proposition makes the intuitively evident assertion that every
natural number, except 1, is a successor. Another way of saying this is that every
natural number except 1 has a predecessor. This claim is clearly not an explicit part
of any of the five axioms. In fact the only axiom that has a hope of implying the
proposition is the inductive axiom, as we shall now see.


Proof of Proposition 5.1: Let P(n) be the statement “either n = 1 or n = m′ for
some natural number m.”

Clearly P(n) is true when n = 1.

Now suppose that the statement P(n) has been established for some natural
number n. We wish to establish it for n′. But n′ is, by definition, a successor. So
the statement is true.

This completes our induction. 














This proof is misleading in its simplicity. The proof consists of little more than 
interpreting Axiom P5. Some other desirable properties of the natural numbers are 
much more difficult to achieve directly from the axioms. As an instance, to prove that 
there is no natural number lying between k and k’ is complicated. Indeed, the entire 
concept of ordering is extremely tricky. And the single most important property of 
the natural numbers (one that is essentially equivalent to the induction axiom, as we 
shall later see) is that the natural numbers are well ordered in a canonical fashion. 
So we must find an efficient method for establishing the properties of the natural 
numbers that are connected with order. 

It is generally agreed (see [SUP, p. 121 ff.]) that the best way to develop further 
properties of the natural numbers is to treat a specific model. Even that approach is 
nontrivial; so we shall only briefly sketch the construction of a model and further 
sketch its order properties so that we may discuss well ordering. The approach that 
we take is by way of the so-called finite ordinal arithmetic.

Let us define a model for the natural numbers as follows:

1 = {∅}
2 = 1 ∪ {1} = {∅, {∅}}
3 = 2 ∪ {2} = {∅, {∅}, {∅, {∅}}}

and, in general,

k′ = k ∪ {k}


It is straightforward to verify Peano’s axioms for this model of the natu- 
ral numbers. Let us first notice that, in this model, the successor n’ of a natural 
number n is given by n’ = n U {n}. Now let us sketch the verification of the axioms. 


P1 is clear by construction, and so is P2. 

P3 is an amusing exercise in logic: if m′ = 1 then there is a set A such that
A ∪ {A} = 1 or A ∪ {A} = {∅}. In particular, x ∈ A ∪ {A} implies x ∈ {∅}. Since
A ∈ A ∪ {A}, it follows that A = ∅. But ∅ is not a natural number. So 1 is not the
successor of a natural number.

For P4, it is convenient to invoke the concept of ordering in our model of the
natural numbers. Say that m < n if m ∈ n. Clearly, if m, n ∈ N and m ≠ n then
either m ∈ n or n ∈ m but not both. Thus we have the usual trichotomy of a strict,
simple order. Now suppose that m, n are natural numbers and that m′ = n′. If m < n
then m ∈ n so m′ ∈ n′ and m′ < n′. That is false. Likewise we cannot have n < m.
Thus it must be that m = n.
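
The finite ordinal model can be imitated with Python's frozenset type. This is a sketch only: the names EMPTY, successor, and less_than are our own, and only the first few numbers are built.

```python
EMPTY = frozenset()


def successor(n):
    """The successor n' = n U {n}."""
    return n | frozenset([n])


def less_than(m, n):
    """The ordering of the model: m < n means m is an element of n."""
    return m in n


ONE = frozenset([EMPTY])   # 1 = {empty set}
TWO = successor(ONE)       # 2 = 1 U {1}
THREE = successor(TWO)     # 3 = 2 U {2}

print(len(ONE), len(TWO), len(THREE))   # 1 2 3
print(less_than(ONE, THREE))            # True
print(less_than(TWO, ONE))              # False
print(successor(ONE) == TWO)            # True
```

In this model the number k is a set with exactly k elements, which is why len recovers the usual numeral.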

We shall discuss P5 a bit later. 

Next we turn to well-ordering. We assert that our model of the natural numbers,
with the ordering defined in our discussion of P4, has the property that if ∅ ≠ S ⊆ N
then there is an element s ∈ S such that s < t for every t ∈ S with t ≠ s. It is clear from
the trichotomy that the least element s, if it exists, is unique. We proceed in several
steps:

Fix a natural number m > 1 and restrict attention to Q(m), which is the set of
natural numbers that are less than m.


Proposition 5.2 The set Q(m) has finitely many elements.


Proof: A natural number k is less than m if and only if k ∈ m. But, by construction,
m has only finitely many elements.














Remark 5.2 As an exercise, you may wish to attempt to prove this last proposition
directly from the Peano axioms.

Proposition 5.3 For each m, the set Q(m) is well ordered.


Proof: The proof is by induction on m.

When m = 1 there is nothing to prove.

Assume that the assertion has been proved for m = k. Now let U be a nonempty subset of
Q(k′). There are now three possibilities:


1. If in fact U ⊆ Q(k) then U has a least element by the inductive hypothesis.


2. If U = {k} then U has but one element and that element, namely k, is the
least element that we seek.

3. The last possibility is that U contains k and some other natural numbers as
well. But then U \ {k} ⊆ Q(k). Hence U \ {k} has a least element s by the
inductive hypothesis. Since s is automatically less than k, it follows that s is
a least element for the entire set U.














Now we have all our tools in place and we can prove the full result: 


Theorem 5.1 The natural numbers N are well ordered.
Proof: Let ∅ ≠ S ⊆ N. Select an element m ∈ S. There are now two possibilities:


1. If Q(m) ∩ S = ∅ then m is the least element of S that we seek.


2. If T = Q(m) ∩ S ≠ ∅ then notice that if x ∈ T and y ∈ S \ T then x < y.
So it suffices for us to find a least element of T. But such an element exists
by applying the preceding proposition to U = Q(m) ∩ S.











The proof is complete. 





We next observe that, in a certain sense, the well-ordering property implies the 
induction property (Axiom P5). By this we mean the following: Do not consider 
any model of the natural numbers, but just consider any number system X satisfying 
P1—P4. Assume that every element of X except 1 has a predecessor, and in addition 
assume that this number system is well ordered. 

Now let P be a property. Assume that P(1) is true and assume the syllogism 
P(k) => P(k’). We claim that P (n) is true for every n. Suppose not. Then the set 


S = {m € X : P(m) is false} 


is nonempty. Let q be the least element of S; this number is guaranteed to exist by
the well-ordering property. The number q cannot be 1, for we assumed that P(1)
is true. But if q > 1 then q has a predecessor r. Since q is the least element of S
then it cannot be that r ∈ S. Thus P(r) must be true. But then, by our hypothesis,
P(r′) must be true. However, r′ = q. So P(q) is true. That is a contradiction.
The only possible conclusion is that S is empty. So P(k) is true for every k.


72 Discrete Mathematics Demystified 


These remarks about well ordering implying P5 are not satisfactory from our 
point of view, because we used the inductive property to establish that every natural 
number other than 1 has a predecessor. But it is important for you to understand that 
any development of the natural numbers will result in induction and well ordering 
being closely linked. 

If you review your calculus book or other elementary texts, you will find that 
both induction and well ordering are occasionally used. But in every instance the 
author will say “These properties of the natural numbers are intuitively clear; trust 
me.” Now you can begin to understand why an elementary textbook author must 
make that choice. The truth about these topics is inexorably linked to the very 
foundations of mathematics, and is therefore both subtle and complicated. 

The remaining big idea connected with the natural numbers is addition. It can 
be proven that a satisfactory theory of addition cannot be developed from P1—P5. 
Instead, it is customary to adjoin two new axioms to our theory: 


P6 If x is a natural number then x + 1 = x′;


P7 For any natural numbers x, y we have
x + y′ = (x + y)′


To illustrate these ideas, let us close the section by proving that 2 + 2 = 4. (Judge
for yourself whether the proof is as obvious as 2 + 2 = 4!) In this argument we use
the definitions 2 = 1′, 3 = 2′, 4 = 3′.


2 + 2 = 2 + 1′      by definition of 2
      = (2 + 1)′    by P7
      = (2′)′       by P6
      = 3′          by definition of 3
      = 4           by definition of 4
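
Axioms P6 and P7 amount to a recursive definition of addition, and the computation above can be replayed mechanically. In the Python sketch below, which is an illustration and not part of the text's development, a natural number is either 1 or the successor of another natural number; the class Nat and the helpers succ and add are our own names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Nat:
    """A natural number: pred is None for 1, otherwise the number it succeeds."""
    pred: Optional["Nat"] = None


def succ(n):
    """The successor n'."""
    return Nat(n)


def add(x, y):
    if y.pred is None:
        return succ(x)              # P6: x + 1 = x'
    return succ(add(x, y.pred))     # P7: x + z' = (x + z)'


ONE = Nat()
TWO = succ(ONE)
THREE = succ(TWO)
FOUR = succ(THREE)

print(add(TWO, TWO) == FOUR)   # True
```

The recursion in add unwinds in exactly the order of the displayed proof: P7 is applied first, then P6.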


In fact it is even trickier to get multiplication to work in the natural number 
system. Almost the only viable method is to add even more axioms that control this 
binary operation. We can say no more about the matter here, but refer to [SUP]. 

It is not difficult to see that, with enough patience (or with induction) one could 
establish all the basic laws of arithmetic. Of course this would not be a fruitful use 
of our time. The celebrated work [WHB] treats this matter in complete detail. 

In the succeeding sections of the present book, we shall take the basic laws of 
arithmetic on the natural numbers as given. We understand that our treatment of the 
natural numbers is incomplete. We have touched on some topics, and indicated some 
constructions. But, when it comes right down to it, we are taking the natural numbers
on faith. All of our future number systems (the integers, the rational numbers, the
reals, the complexes, the quaternions) will be constructed rigorously. The somewhat
bewildering situation before us is that the more complicated number systems are
easier to construct.


5.3 The Integers 


Now we will apply the notion of an equivalence class to construct the integers (both 
positive and negative). There is an important point of knowledge to be noted here. 
In view of Sec. 5.2, we may take the natural numbers as given. The natural numbers 
are universally accepted, and we have indicated how they may be constructed in a 
formal manner. However, the number zero and the negative numbers are a different 
matter. It was not until the fifteenth century that the concepts of zero and negative 
numbers started to take hold—for they do not correspond to explicit collections 
of objects (five fingers or ten shoes) but rather to concepts (zero books is the lack 
of books; minus four pens means that we owe someone four pens). After some 
practice we get used to negative numbers, but explaining in words what they mean 
is always a bit clumsy. 

In fact it is sobering to realize that the Italian mathematicians of the fifteenth 
and sixteenth centuries referred to negative numbers—in their formal writings—as 
“fictitious” or “absurd.” Mathematics is, in part, a subject that we must get used to. 
It took several hundred years for mankind to get used to negative numbers. 

It is much more satisfying, from the point of view of logic, to construct the 
integers from what we already have, that is, from the natural numbers. We proceed 
as follows. Let A = N x N, the set of ordered pairs of natural numbers. We define 
a relation R on A as follows: 


(a, b) is related to (a*, b*) if a + b* = a* + b


Theorem 5.2 The relation R is an equivalence relation.

Proof: That (a, b) is related to (a, b) follows from the trivial identity a + b =
a + b. Hence R is reflexive. Second, if (a, b) is related to (a*, b*) then a + b* =
a* + b hence a* + b = a + b* (just reverse the equality) hence (a*, b*) is related
to (a, b). So R is symmetric.


Finally, if (a, b) is related to (a*, b*) and (a*, b*) is related to (a**, b**) then we
have


a + b* = a* + b and a* + b** = a** + b*

Adding these equations gives


(a + b*) + (a* + b**) = (a* + b) + (a** + b*)

Cancelling a* and b* from each side finally yields


a + b** = a** + b


Thus (a, b) is related to (a**, b**). Therefore R is transitive. We conclude that R
is an equivalence relation.














Remark 5.3 We cheated a bit in the proof of Theorem 5.2. Since we do not yet 
have negative numbers, we therefore have not justified the process of “cancelling” 
that we used. The most rudimentary form of cancellation is Axiom P4 of the natural 
numbers. Suggest a way to use induction, together with Axiom P4, to prove that if 
a, b,c are natural numbers and if a + b = c + b then a = c. 


Now our job is to understand the equivalence classes which are induced by R.
Let (a, b) ∈ A = N × N and let [(a, b)] be the corresponding equivalence class. If
b > a then we will denote this equivalence class by the integer b − a. For instance,
the equivalence class [(2, 7)] will be denoted by 5. Notice that if (a*, b*) ∈ [(a, b)]
then a + b* = a* + b hence b* − a* = b − a as long as b > a. Therefore the nu-
meral that we choose to represent our equivalence class is independent of which
element of the equivalence class is used to compute it.

If (a, b) ∈ A and b = a then we let the symbol 0 denote the equivalence class
[(a, b)]. Notice that if (a*, b*) is any other element of this particular [(a, b)] then
it must be that a + b* = a* + b hence b* = a*; therefore this definition is unam-
biguous.

If (a, b) ∈ A and a > b then we will denote the equivalence class [(a, b)] by the
symbol −(a − b). For instance, we will denote the equivalence class [(7, 5)] by the
symbol −2. Once again, if (a*, b*) is related to (a, b) then the equation a + b* =
a* + b guarantees that our choice of symbol to represent [(a, b)] is unambiguous.

Thus we have given our equivalence classes names, and these names look just 
like the names that we give to integers: there are positive integers, and negative 
ones, and zero. But we want to see that these objects behave like integers. (As you 
read on, use the informal mnemonic that the equivalence class [(a, b)] stands for 
the integer b — a.) 

First, do these new objects that we have constructed add correctly? Well, let 
A = [(a, b)] and C = [(c, d)] be two equivalence classes. Define their sum to be 
A + C = [(a + c, b + d)]. We must check that this is unambiguous. If (a, b) is
related to (ã, b̃) and (c, d) is related to (c̃, d̃) then of course we know that

a + b̃ = ã + b

and

c + d̃ = c̃ + d

Adding these two equations gives

(a + c) + (b̃ + d̃) = (ã + c̃) + (b + d)

hence (a + c, b + d) is related to (ã + c̃, b̃ + d̃). Thus adding two of our
equivalence classes gives another equivalence class, as it should. We say that addition of
integers is well defined. 

This point is so significant that it bears repeating. Each integer is an equivalence 
class—that is, a set. If we are going to add two integers m and n by choosing an 
element from the set m and another element from the set n, then the operation that 
we define had better be independent of the choice of elements. This is another way 
of saying that we want the sum of two equivalence classes to be another equivalence 
class. We call this the concept of “well definedness.” 


EXAMPLE 5.1 
To add 5 and 3 we first note that 5 is the equivalence class [(2, 7)] and 3 is the 
equivalence class [(2,5)]. We add them componentwise and find that the sum is 
[(2 + 2,7+5)] = [(4, 12)]. Which equivalence class is this answer? Looking back 
at our prescription for giving names to the equivalence classes, we see that this is 
the equivalence class that we called 12 — 4 or 8. So we have rediscovered the fact 
that 5 + 3 = 8.

Now let us add 4 and —9. The first of these is the equivalence class [(3, 7)] and 
the second is the equivalence class [(13, 4)]. The sum is therefore [(16, 11)], and
this is the equivalence class that we call —(16 — 11) or —5. That is the answer that 
we would expect when we add 4 to —9. 

Next, we add —12 and —5. Previous experience causes us to expect the answer 
to be —17. Now —12 is the equivalence class [(19, 7)] and —5 is the equivalence 
class [(7, 2)]. The sum is [(26, 9)], which is the equivalence class that we call —17. 

Finally, we can see in practice that our method of addition is unambiguous. Let 
us redo the second example using [(6, 10)] as the equivalence class denoted by 4 
and [(15, 6)] as the equivalence class denoted by —9. Then the sum is [(21, 16)], 
and this is still the equivalence class —5, as it should be. 
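
The arithmetic of Example 5.1 can be replayed mechanically. The following is a minimal Python sketch, not part of the construction itself: the names `related`, `name`, and `add` are hypothetical helpers introduced here for illustration, and a pair of natural numbers is simply modelled as a Python tuple.

```python
from typing import Tuple

# A pair (a, b) of natural numbers; informally it stands for the integer b - a.
Pair = Tuple[int, int]


def related(p: Pair, q: Pair) -> bool:
    """(a, b) is related to (a*, b*) exactly when a + b* = a* + b."""
    (a, b), (a_star, b_star) = p, q
    return a + b_star == a_star + b


def name(p: Pair) -> int:
    """Return the familiar integer that names the class [(a, b)]."""
    a, b = p
    if b > a:
        return b - a        # a positive integer
    if b == a:
        return 0            # the class called 0
    return -(a - b)         # a negative integer; a - b is a natural number here


def add(p: Pair, q: Pair) -> Pair:
    """Componentwise sum of representatives: (a + c, b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)
```

Under these assumptions, `name(add((2, 7), (2, 5)))` names the class [(4, 12)], and adding different representatives of 4 and -9 gives pairs that `related` reports as equivalent, which is exactly the well-definedness checked in the text.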














Remark 5.4 What is the point of this section? Everyone knows about negative 
numbers, so why go through this abstract construction? The reason is that until one 
sees this construction, negative numbers are just imaginary objects—placeholders 
if you will—which are a useful notation but which do not exist. Now they do 
exist. They are a collection of equivalence classes of pairs of natural numbers. 
This collection is equipped with certain arithmetic operations, such as addition, 
subtraction, and multiplication. We now discuss these last two. 

If A = [(a, b)] and C = [(c, d)] are integers, then we define their difference 
to be the equivalence class [(a + d, b + c)]; we denote this difference by A - C.
(Note that we may not use subtraction of natural numbers in our definition of
subtraction of integers; subtraction of natural numbers is not, in general, defined.) 
The unambiguity (or well definedness) of this definition is treated in the exercises. 


EXAMPLE 5.2 
We calculate 8 - 14. Now 8 = [(1, 9)] and 14 = [(3, 17)]. Therefore


8 - 14 = [(1 + 17, 9 + 3)] = [(18, 12)] = -6


as expected. 
As a second example, we compute (-4) - (-8). Now


(-4) - (-8) = [(6, 2)] - [(13, 5)] = [(6 + 5, 2 + 13)] = [(11, 15)] = 4














Remark 5.5 When we first learn that (—4) — (—8) = (—4) + 8 = 4, the expla- 
nation is a bit mysterious: why is “minus a minus equal to a plus”? Now there is 
no longer any mystery: this property follows from our construction of the number 
system Z and its arithmetic operation. 


Remark 5.6 It is interesting to sort out the last example from the justification for 
the arithmetic of negative numbers that we learn in high school. Here is an example 
of that reasoning. 

It is postulated that negative numbers exist (they certainly are not constructed). 
Then it is noted that 


18 + (8 - 14) = (18 - 14) + 8 = 4 + 8 = 12 = 18 - 6 = 18 + (-6)


Identifying the far left and far right sides of the equation, we are forced to the 
conclusion that 8 — 14 = —6. 

This reasoning is perfectly correct. But it presupposes the existence of a number 
system that (a) contains negative integers and (b) obeys all the familiar laws of 
arithmetic. 

The advantage of the presentation in this section of the present book is that we 
actually construct such a number system. We do not presuppose it. The additive 
properties of negative numbers follow automatically from our construction. They 
are not derived by algebraic tricks from some numbers that we do not actually 
know exist. 


Finally we turn to multiplication. If A = [(a, b)] and C = [(c, d)] are integers 
then we define their product by the formula 


A · C = [(a · d + b · c, a · c + b · d)]


This definition may be a surprise. Why did we not define A · C to be [(a · c, b · d)]?
There are several reasons: first of all, the latter definition would give the wrong
answer; moreover, it is not unambiguous (different representatives of A and C
would give a different answer). If you recall that we think of [(a, b)] as representing
b - a and [(c, d)] as representing d - c then the product should be the equivalence
class that represents (b - a) · (d - c). That is the motivation behind our definition.

The unambiguity (or well definedness) of the given definition of multiplication 
of integers is treated in the exercises. We proceed now to an example. 


EXAMPLE 5.3 
We compute the product of —3 and —6. Now 


(-3) · (-6) = [(5, 2)] · [(9, 3)] = [(5 · 3 + 2 · 9, 5 · 9 + 2 · 3)]
= [(33, 51)] = 18


which is the expected answer.
As a second example, we multiply -5 and 12. We have


(-5) · 12 = [(7, 2)] · [(1, 13)] = [(7 · 13 + 2 · 1, 7 · 1 + 2 · 13)]
= [(93, 33)] = -60

Finally, we show that 0 times any integer A equals zero. Let A = [(a, b)]. Then

0 · A = [(1, 1)] · [(a, b)] = [(1 · b + 1 · a, 1 · a + 1 · b)]
= [(a + b, a + b)] = 0
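
The difference and product formulas can be checked against Examples 5.2 and 5.3 in the same spirit. This is only a sketch under stated assumptions: `sub`, `mul`, and `name` are hypothetical helper names, and pairs of natural numbers are modelled as Python tuples.

```python
def name(p):
    """Integer named by the class [(a, b)], following the mnemonic b - a."""
    a, b = p
    if b > a:
        return b - a
    if b == a:
        return 0
    return -(a - b)


def sub(p, q):
    """[(a, b)] - [(c, d)] = [(a + d, b + c)]; uses only addition of naturals."""
    (a, b), (c, d) = p, q
    return (a + d, b + c)


def mul(p, q):
    """[(a, b)] · [(c, d)] = [(a·d + b·c, a·c + b·d)]."""
    (a, b), (c, d) = p, q
    return (a * d + b * c, a * c + b * d)
```

Note that neither function ever subtracts natural numbers, which is the point of the definitions: for instance `mul((5, 2), (9, 3))` produces the pair (33, 51), and only the naming step reads it as 18.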


Remark 5.7 Notice that one of the pleasant by-products of our construction of 
the integers is that we no longer have to give artificial explanations for why the 
product of two negative numbers is a positive number or why the product of a 
negative number and a positive number is negative. These properties instead follow 
automatically from our construction. 














Remark 5.8 It is interesting to sort out the last example from the justification for 
the arithmetic of negative numbers that we learn in high school. Here is an example 
of that reasoning. 

It is postulated that negative numbers exist (they certainly are not constructed). 
Then it is noted that 


3 · 8 = (6 - 3) · 8 = 6 · 8 - 3 · 8

hence

24 = 48 - 3 · 8

or, using reasoning as in our last remark but one,

-24 = -3 · 8

Similarly, one can show that

-48 = -6 · 8

Taking these two facts for granted, we then compute that

(8 - 3) · (8 - 6) = 8 · 8 + 8 · (-6) + (-3) · 8 + (-3) · (-6)

As a result,

10 = 64 - 48 - 24 + (-3) · (-6)

or

10 + 72 - 64 = (-3) · (-6)

hence

18 = (-3) · (-6)


Again, this reasoning is perfectly correct. But it presupposes the existence of 
a number system that (a) contains negative integers and (b) obeys all the familiar 
laws of arithmetic. 

The advantage of the presentation in this section of the present book is that we 
actually construct such a number system. We do not presuppose it. The multiplica- 
tive properties of negative numbers follow automatically from our construction. 
They are not derived by algebraic tricks from some numbers that we do not actually 
know exist. 


Notice that the integers Z, as we have constructed them, contain the element 
0 = [(1, 1)]. This element is the additive identity in the sense that x + 0 = 0+ 
x = x for any integer x. Also, if y = [(a, b)] is any integer then it has an additive 
inverse —y = [(b,a)]. This means that y + (—y) = 0. As a result of these two 
facts, the integers Z form a group. We shall say more about groups in Sec. 9.4. 

Of course we will not discuss division for integers; in general division of one 
integer by another makes no sense in the universe of the integers. More will be said 
about this fact in the Exercises. 

In the rest of this book we will follow the standard mathematical custom of 
denoting the set of all integers by the symbol Z. We will write the integers not as 
equivalence classes, but in the usual way as the sequence of digits ... — 3, —2, 
—1,0,1,2,3,.... The equivalence classes are a device that we used to construct 
the integers. Now that we have them, we may as well write them in the simple, 
familiar fashion and manipulate them as usual. 

In an exhaustive treatment of the construction of Z, we would prove that addition
and multiplication are commutative and associative, prove the distributive law, and 
so forth. But the purpose of this section is to demonstrate modes of logical thought 
rather than to be exhaustive. We shall say more about some of the elementary 
properties of the integers in the Exercises. 


5.4 The Rational Numbers 


In this section we use the integers, together with a construction using equivalence 
classes, to build the rational numbers. Let A be the set Z × (Z \ {0}). In other
words, A is the set of ordered pairs (a, b) of integers subject to the condition that
b ≠ 0. (Think of this ordered pair as ultimately “representing” the fraction a/b.)
We definitely want it to be the case that certain ordered pairs represent the same 
number. For instance, 


1/2 should be the same number as 3/6


This motivates our equivalence relation. Declare (a, b) to be related to (a*, b*) if
a · b* = a* · b. (Here we are thinking that the fraction a/b should equal the fraction
a*/b* precisely when a · b* = a* · b.)

Is this an equivalence relation? Obviously the pair (a, b) is related to itself,
since a · b = a · b. Also the relation is symmetric: if (a, b) and (a*, b*) are pairs
and a · b* = a* · b then a* · b = a · b*. Finally, if (a, b) is related to (a*, b*) and
(a*, b*) is related to (a**, b**) then we have both


a · b* = a* · b   and   a* · b** = a** · b*     (5.1)


Multiplying the left sides of these two equations together and the right sides together
gives


(a · b*) · (a* · b**) = (a* · b) · (a** · b*)


If a* = 0 then it follows immediately from Eq. 5.1 that both a and a** must be
zero. So the three pairs (a, b), (a*, b*), and (a**, b**) are equivalent and there is
nothing to prove. So we may assume that a* ≠ 0. We know a priori that b* ≠ 0;
therefore we may cancel common terms in the last equation to obtain


a · b** = a** · b


Thus (a, b) is related to (a**, b**), and our relation is transitive. (Exercise: Explain 
why it is correct to “cancel common terms” in the last step.) 

The resulting collection of equivalence classes will be called the set of rational 
numbers, and we shall denote this set with the symbol Q. 

EXAMPLE 5.4

The equivalence class [(4, 12)] contains all of the pairs (4, 12), (1, 3), (-2, -6).
(Of course it contains infinitely many other pairs as well.) This equivalence
class represents the fraction 4/12 which we sometimes also write as 1/3 or
(-2)/(-6).














If [(a, b)] and [(c, d)] are rational numbers then we define their product to be 
the rational number 


[(a · c, b · d)]

This is well defined (unambiguous), for the following reason. Suppose that (a, b) is
related to (ã, b̃) and (c, d) is related to (c̃, d̃). We would like to know that
[(a, b)] · [(c, d)] = [(a · c, b · d)] is the same equivalence class as
[(ã, b̃)] · [(c̃, d̃)] = [(ã · c̃, b̃ · d̃)]. In other words we need to know that

(a · c) · (b̃ · d̃) = (ã · c̃) · (b · d).     (5.2)

But our hypothesis is that

a · b̃ = ã · b   and   c · d̃ = c̃ · d

Multiplying together the left sides and the right sides we obtain

(a · b̃) · (c · d̃) = (ã · b) · (c̃ · d)

Rearranging, we have

(a · c) · (b̃ · d̃) = (ã · c̃) · (b · d)

But this is just Eq. 5.2. So multiplication is unambiguous.


EXAMPLE 5.5 
The product of the two rational numbers [(3, 8)] and [(—2, 5)] is 


[(3 · (-2), 8 · 5)] = [(-6, 40)] = [(-3, 20)]











This is what we expect: the product of 3/8 and —2/5 is —3/20. 
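
The equivalence relation on pairs and the product rule can be exercised directly. The sketch below is illustrative only: `related`, `mul`, and `as_fraction` are hypothetical names chosen here, and Python's standard `fractions.Fraction` is used merely as an independent check on the answers.

```python
from fractions import Fraction


def related(p, q):
    """(a, b) is related to (a*, b*) exactly when a · b* = a* · b."""
    (a, b), (a_star, b_star) = p, q
    return a * b_star == a_star * b


def mul(p, q):
    """[(a, b)] · [(c, d)] = [(a · c, b · d)]."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)


def as_fraction(p):
    """The familiar fraction a/b that the pair (a, b) represents (b must be nonzero)."""
    a, b = p
    return Fraction(a, b)
```

For example, `mul((3, 8), (-2, 5))` returns the pair (-6, 40), which `related` identifies with (-3, 20). Replacing the inputs by other representatives such as (6, 16) and (4, -10) yields a different pair in the same class, illustrating Eq. 5.2.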





If q = [(a, b)] and r = [(c, d)] are rational numbers and if r is not zero (that is,
[(c, d)] is not the equivalence class zero, in other words c ≠ 0) then we define
the quotient q/r to be the equivalence class

[(a · d, b · c)]


We leave it to you to check that this operation is well defined. 


EXAMPLE 5.6 
The quotient of the rational number [(4, 7)] by the rational number [(3, —2)] is, by 
definition, the rational number 


[(4 · (-2), 7 · 3)] = [(-8, 21)]


This is what we expect: the quotient of 4/7 by —3/2 is —8/21. 














How should we add two rational numbers? We could try declaring [(a, b)] + 
[(c, d)] to be [(a + c, b + d)], but this will not work (think about the way that we 
usually add fractions). Instead we define 


[(a, b)] + [(c, d)] = [(a · d + b · c, b · d)]


That this definition is well defined (unambiguous) is left for the exercises. We turn 
instead to an example. 


EXAMPLE 5.7 
The sum of the rational numbers [(3, —14)] and [(9, 4)] is given by 


[(3 · 4 + (-14) · 9, (-14) · 4)] = [(-114, -56)] = [(57, 28)]


This coincides with the usual way that we add fractions:


-3/14 + 9/4 = 57/28
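
The quotient and sum rules, and the failure of the naive componentwise sum, can be illustrated as follows. This is a sketch under assumptions: `add`, `div`, and `naive_add` are hypothetical names, and `fractions.Fraction` serves only as an independent check.

```python
from fractions import Fraction


def add(p, q):
    """[(a, b)] + [(c, d)] = [(a · d + b · c, b · d)]."""
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)


def div(p, q):
    """Quotient [(a, b)] / [(c, d)] = [(a · d, b · c)], defined only when c is nonzero."""
    (a, b), (c, d) = p, q
    if c == 0:
        raise ZeroDivisionError("cannot divide by the rational number zero")
    return (a * d, b * c)


def naive_add(p, q):
    """The tempting but incorrect rule (a + c, b + d), kept for comparison."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)
```

The naive rule is not well defined: (1, 2) and (2, 4) represent the same rational number, yet `naive_add((1, 2), (1, 3))` and `naive_add((2, 4), (1, 3))` give the inequivalent pairs (2, 5) and (3, 7).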


Notice that the equivalence class [(0, 1)] is the rational number that we usually 
denote by 0. It is the additive identity, for if [(a, b)] is another rational number then 


[(0, 1)] + [(a, b)] = [(0 · b + 1 · a, 1 · b)] = [(a, b)]


A similar argument shows that [(0, 1)] times any rational number gives [(0, 1)] or
0. By the same token the rational number [(1, 1)] is the multiplicative identity. We
leave the details for you.

Of course the concept of subtraction is really just a special case of addition [that
is, a - b is the same as a + (-b)]. So we shall say nothing further about subtraction.

In practice we will write rational numbers in the traditional fashion:


2 —-19 22 24 
e ee ey 
In mathematics it is generally not wise to write rational numbers in mixed form,
such as 2 3/5, because the juxtaposition of two numbers could easily be mistaken for
multiplication. Instead we would write this quantity as the improper fraction 13/5.





Definition 5.1 A set S is called a field if it is equipped with a binary operation
(usually called addition and denoted “+”) and a second binary operation (usually
called multiplication and denoted “·”) such that the following axioms are satisfied:


A1. S is closed under addition: if x, y ∈ S then x + y ∈ S.
A2. Addition is commutative: if x, y ∈ S then x + y = y + x.
A3. Addition is associative: if x, y, z ∈ S then x + (y + z) = (x + y) + z.
A4. There exists an element, called 0, in S which is an additive identity: if
x ∈ S then 0 + x = x.
A5. Each element of S has an additive inverse: if x ∈ S then there is an
element -x ∈ S such that x + (-x) = 0.


M1. S is closed under multiplication: if x, y ∈ S then x · y ∈ S.
M2. Multiplication is commutative: if x, y ∈ S then x · y = y · x.
M3. Multiplication is associative: if x, y, z ∈ S then x · (y · z) = (x · y) · z.
M4. There exists an element, called 1, which is a multiplicative identity: if
x ∈ S then 1 · x = x.
M5. Each nonzero element of S has a multiplicative inverse: if 0 ≠ x ∈ S
then there is an element x^{-1} ∈ S such that x · (x^{-1}) = 1. The element x^{-1} is
sometimes denoted by 1/x.


D1. Multiplication distributes over addition: if x, y, z ∈ S then x · (y + z) =
x · y + x · z.


Eleven axioms is a lot to digest all at once, but in fact these are all familiar 
properties of addition and multiplication of rational numbers that we use every 
day: the set Q, with the usual notions of addition and multiplication (and with the 
usual additive identity 0 and multiplicative identity 1), forms a field. The integers, by 
contrast, do not: nonzero elements of Z (except 1 and —1) do not have multiplicative 
inverses in the integers. 
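
A spot check can make the eleven axioms feel less abstract. The sketch below tests the identities on a small sample of rational numbers using Python's `fractions.Fraction`, and then searches a finite window of integers for a multiplicative inverse. The names `samples`, `check_field_axioms`, and `has_integer_inverse` are hypothetical, and a finite check of this kind is evidence rather than proof.

```python
from fractions import Fraction
from itertools import product

# A small sample of rationals: numerators -3..3 over denominators 1, 2, 3.
samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]


def check_field_axioms(xs):
    """Assert the field identities on every triple drawn from xs."""
    zero, one = Fraction(0), Fraction(1)
    for x, y, z in product(xs, repeat=3):
        assert x + y == y + x                  # A2
        assert x + (y + z) == (x + y) + z      # A3
        assert x * y == y * x                  # M2
        assert x * (y * z) == (x * y) * z      # M3
        assert x * (y + z) == x * y + x * z    # D1
    for x in xs:
        assert zero + x == x                   # A4
        assert x + (-x) == zero                # A5
        assert one * x == x                    # M4
        if x != zero:
            assert x * (1 / x) == one          # M5
    return True


def has_integer_inverse(n, search=range(-100, 101)):
    """Look for an integer m in the search window with n · m = 1."""
    return any(n * m == 1 for m in search)
```

On this sample every axiom holds, while `has_integer_inverse(2)` finds nothing in the window, in line with the remark that only 1 and -1 are invertible in Z.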

Let us now consider some consequences of the field axioms. 

Theorem 5.3 Any field has the following properties:


(1) If z + x = z + y then x = y.
(2) If x + z = 0 then z = -x (the additive inverse is unique).
(3) -(-y) = y.
(4) If y ≠ 0 and y · x = y · z then x = z.
(5) If y ≠ 0 and y · z = 1 then z = y^{-1} (the multiplicative inverse is unique).
(6) (x^{-1})^{-1} = x.
(7) 0 · x = 0.
(8) If x · y = 0 then either x = 0 or y = 0.
(9) (-x) · y = -(x · y) = x · (-y).
(10) (-x) · (-y) = x · y.


Proof: These are all familiar properties of the rationals, but now we are consid- 
ering them for an arbitrary field. We prove just a few to illustrate the logic. The 
proofs of the others are assigned as exercises. 

To prove (1) we write 


z + x = z + y  ⇒  (-z) + (z + x) = (-z) + (z + y)

and now Axiom A3 yields that this implies

[(-z) + z] + x = [(-z) + z] + y

Next, Axiom A5 yields that

0 + x = 0 + y

and hence, by Axiom A4,

x = y

To prove (7), we observe that 
0 · x = (0 + 0) · x

which by Axiom M2 equals

x · (0 + 0)

By Axiom D1 the last expression equals

x · 0 + x · 0

which by Axiom M2 equals 0 · x + 0 · x. Thus we have derived the equation

0 · x = 0 · x + 0 · x

Axioms A4 and A2 let us rewrite the left side as

0 · x + 0 = 0 · x + 0 · x

Finally, part (1) of the present theorem (which we have already proved) yields that

0 = 0 · x


which is the desired result. 
To prove (8), we suppose that x ≠ 0. In this case x has a multiplicative inverse
x^{-1} and we multiply both sides of our equation by this element:

x^{-1} · (x · y) = x^{-1} · 0

By Axiom M3, the left side can be rewritten and we have

(x^{-1} · x) · y = x^{-1} · 0

Next, we rewrite the right side using Axiom M2:

(x^{-1} · x) · y = 0 · x^{-1}

Now Axiom M5 allows us to simplify the left side:

1 · y = 0 · x^{-1}

We further simplify the left side using Axiom M4 and the right side using (7) of
the present theorem (which we just proved) to obtain:

y = 0

Thus we see that if x ≠ 0 then y = 0. But this is logically equivalent to x = 0
or y = 0, as we wished to prove. (If you have forgotten why these statements are 
logically equivalent, write out a truth table.) 














EXAMPLE 5.8 

The integers Z form a strictly, simply ordered set when equipped with the usual 
ordering. We can make this ordering precise by saying that x < y if y - x is a
positive integer. For instance,


6 < 8 because 8 - 6 = 2 > 0

Likewise,


-5 < -1 because -1 - (-5) = 4 > 0











Observe that the same ordering works on the rational numbers. 





If A is a strictly ordered set and a, b are elements then we often write a ≤ b to
mean that either a = b or a < b.

When a field has an ordering which is compatible with the field operations then 
a richer structure results. 


Definition 5.2 A field F is called an ordered field if F has a strict, simple ordering <
that satisfies the following additional properties:

1. If x, y, z ∈ F and if y < z then x + y < x + z.

2. If x, y ∈ F, x > 0, and y > 0 then x · y > 0.


Again, these are familiar properties of the rational numbers: Q forms an ordered 
field. Some further properties of ordered fields may be proved from the axioms. 


Theorem 5.4 Any ordered field has the following properties: 


(1) If x > 0 and z < y then x · z < x · y.

(2) If x < 0 and z < y then x · z > x · y.

(3) If x > 0 then -x < 0. If x < 0 then -x > 0.

(4) If 0 < y < x then 0 < 1/x < 1/y.

(5) If x ≠ 0 then x^2 > 0.

(6) If 0 < x < y then x^2 < y^2.


Proof: Again we prove just a few of these statements and leave the rest as
exercises.


To prove (1), observe that the property (1) of ordered fields together with our 
hypothesis implies that 


(-z) + z < (-z) + y


Thus, using Axiom A2, we see that y - z > 0. Since x > 0, property (2) of ordered
fields gives


x · (y - z) > 0

Finally,


x · y = x · [(y - z) + z] = x · (y - z) + x · z > 0 + x · z

[by property (1) of ordered fields again]. In conclusion,

x · y > x · z

To prove (3), begin with the equation

0 = -x + x


Since x > 0, the right side is greater than —x. Thus 0 > —x as claimed. The proof 
of the other statement of (3) is similar. 

To prove (5), we consider two cases. If x > 0 then x^2 = x · x is positive by
property (2) of ordered fields. If x < 0 then -x > 0 [by part (3) of the present
theorem, which we just proved] hence (-x) · (-x) > 0. But part (10) of the last
theorem guarantees that (-x) · (-x) = x · x hence x · x > 0.


5.5 The Real Number System 


Now that we are accustomed to the notion of equivalence classes, the construction 
of the integers and of the rational numbers seems fairly natural. In fact equivalence 
classes provide a precise language for declaring certain objects to be equal (or for 
identifying certain objects). We can now use the integers and the rationals as we 
always have done, with the added confidence that they are not simply a useful 
notation but that they have been constructed. 

We turn next to the real numbers. We saw in Sec. 5.4 that the rational number 
system is not closed under the operation of taking square roots, for example. We 
know from calculus that for many other purposes the rational numbers are inade- 
quate. It is important to work in a number system which is closed with respect to 
all the operations we shall perform. While the rationals are closed under the usual 
arithmetic operations, they are not closed under the operation of taking limits. For 
instance, the sequence of rational numbers 3, 3.1, 3.14, 3.141, ... consists of terms 
that seem to be getting closer and closer together, seem to tend to some limit, and 
yet there is no rational number which will serve as a limit (of course it turns out 
that the limit is π, an “irrational” number).

We will now deal with the real number system, a system which contains all 
limits of sequences of rational numbers (as well as all limits of sequences of real 
numbers!). In fact our plan will be as follows: in this section we shall discuss all the 
requisite properties of the reals. The actual construction of the reals is rather com- 
plicated, and we shall put that in an appendix (Sec. 5.5.1) at the end of the section. 
















Definition 5.3 Let A be an ordered set and X a subset of A. The set X is called
bounded above if there is an element b ∈ A such that x ≤ b for all x ∈ X. We call
the element b an upper bound for the set X.

EXAMPLE 5.9
Let A = Q with the usual ordering. The set X = {x ∈ Q : 2 < x < 4} is bounded
above. For example, the number 15 is an upper bound for X. So are the numbers 12
and 4. It is interesting to observe that no element of this particular X can actually
be an upper bound for X. The number 4 is a good candidate, but 4 is not an element
of X. In fact if b ∈ X then (b + 4)/2 ∈ X and b < (b + 4)/2, so b could not be an
upper bound for X.














It turns out that the most convenient way to formulate the notion that the real 
numbers have “no gaps” (that is, that all sequences which seem to be converging 
actually have something to converge to) is in terms of upper bounds. 


Definition 5.4 Let A be an ordered set and X a subset of A. An element b ∈ A is
called a least upper bound (or supremum) for X if b is an upper bound for X and 
there is no upper bound b* for X which is less than b. 

By its very definition, if a least upper bound exists then it is unique. 


EXAMPLE 5.10 
In the last example, we considered the set X of rational numbers strictly between 2 
and 4. We observed there that 4 is the least upper bound for X. Note that this least 
upper bound is not an element of the set X. 

The set Y = {y ∈ Z : -9 ≤ y ≤ 7} has least upper bound 7. In this case, the
least upper bound is an element of the set Y. 














Notice that we may define a lower bound for a subset of an ordered set in a 
fashion similar to that for an upper bound: ℓ ∈ A is a lower bound for X ⊂ A if
x ≥ ℓ for all x ∈ X. A greatest lower bound (or infimum) for X is then defined to
be a lower bound ℓ such that there is no lower bound ℓ* with ℓ* > ℓ.


EXAMPLE 5.11 
The set X in the last two examples has lower bounds —20, 0,1, 2, for instance. 
The greatest lower bound is 2, which is not an element of the set. 

The set Y in the last example has lower bounds —53, —22, —10, —9, to name 
just a few. The number —9 is the greatest lower bound. It is an element of Y. 
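
The two running examples behave differently, and a few lines of Python make the contrast concrete. This is a sketch only: `in_X` and `larger_element_of_X` are hypothetical names, exact rationals come from the standard `fractions` module, and Y is taken to be the integers from -9 to 7 inclusive.

```python
from fractions import Fraction


def in_X(q):
    """Membership in X = {x in Q : 2 < x < 4}."""
    return 2 < q < 4


def larger_element_of_X(b):
    """Given b in X, return (b + 4)/2, which lies in X and exceeds b."""
    return (b + 4) / 2


# The integers y with -9 <= y <= 7.
Y = range(-9, 8)
```

Starting from any b in X, `larger_element_of_X(b)` produces a strictly larger element of X, so X has no largest element even though 4 is its least upper bound. For the finite set Y, by contrast, `max(Y)` and `min(Y)` are the supremum and infimum and both belong to Y.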














EXAMPLE 5.12 
Let S = Z ⊂ R. Then S does not have an upper bound.














The purpose that the real numbers will serve for us is as follows: they will contain 
the rationals, they will still be an ordered field, and every subset which has an upper 
bound will have a least upper bound. We formulate this property as a theorem. 

Theorem 5.5 There exists an ordered field R which (a) contains Q and (b) has the
property that any nonempty subset of R which has an upper bound has a least upper 
bound. 


The last property described in this theorem is called the least upper bound 
property of the real numbers. As mentioned previously, this theorem will be proved 
in Sec. 5.5.1. Now we begin to realize why it is so important to construct the number 
systems that we will use. We are endowing R with a great many properties. Why 
do we have any right to suppose that there exists a number system with all these 
properties? We must produce one! 

Let us begin to explore the richness of the real numbers. The next theorem 
states a property which is certainly not shared by the rationals (see Sec. 5.4). It is 
fundamental in its importance. 


Theorem 5.6 Let x be a real number such that x > 0. Then there is a positive real
number y such that y^2 = y · y = x.


Proof: We will use throughout this proof the fact (see part (6) of Theorem 5.4)
that if 0 < a < b then a^2 < b^2.
Let


S = {s ∈ R : s > 0 and s^2 < x}


Then S is not empty since x/2 ∈ S if x < 2 and 1 ∈ S otherwise. Also S is bounded
above since x + 1 is an upper bound for S. By Theorem 5.5, the set S has a least
upper bound. Call it y. Obviously 0 < min{x/2, 1} ≤ y hence y is positive. We
claim that y^2 = x. To see this, we eliminate the other two possibilities.

If y^2 < x then set ε = (x - y^2)/[4(x + 1)]. Then ε > 0 and


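(y + ε)^2 = y^2 + 2 · y · ε + ε^2

          = y^2 + 2 · y · (x - y^2)/[4(x + 1)] + ((x - y^2)/[4(x + 1)])^2

          = y^2 + ((x - y^2)/2) · (y/(x + 1)) + ((x - y^2)/4) · ((x - y^2)/[4(x + 1)^2])

          < y^2 + (x - y^2)/2 + (x - y^2)/4

          < y^2 + (x - y^2)

          = x

(Here y/(x + 1) ≤ 1 because x + 1 is an upper bound for S, and
(x - y^2)/[4(x + 1)^2] < 1 because x - y^2 < x.)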


Thus y + ε ∈ S, and y cannot be an upper bound for S. This contradiction tells us
that y^2 ≮ x.

Similarly, if it were the case that y^2 > x then we set ε = (y^2 - x)/[4(x + 1)].
A calculation like the one we just did then shows that (y - ε)^2 > x. Hence y - ε
is also an upper bound for S, and y is therefore not the least upper bound. This
contradiction shows that y^2 ≯ x.

The only remaining possibility is that y^2 = x.
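
The proof is an existence argument, but the set S it uses also suggests a way to approximate the square root. The following sketch is not the method of the proof: it is a bisection, restricted to rational x so that the arithmetic is exact. It keeps a member of S below and an upper bound of S above, starting from the same element min{x/2, 1} and the same bound x + 1 that the proof uses. The name `sqrt_bounds` and the parameter `steps` are hypothetical.

```python
from fractions import Fraction


def sqrt_bounds(x, steps=40):
    """Return (lo, hi) with lo in S and hi an upper bound of S = {s > 0 : s^2 < x}.

    x is assumed to be a positive rational number.
    """
    x = Fraction(x)
    lo = min(x / 2, Fraction(1))   # an element of S, as in the proof
    hi = x + 1                     # an upper bound for S, as in the proof
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid < x:
            lo = mid               # mid belongs to S
        else:
            hi = mid               # mid^2 >= x, so mid bounds S from above
    return lo, hi
```

Each step halves the gap between the two bounds, so the least upper bound of S is trapped in an interval whose length shrinks to zero; for x = 2 the interval closes in on the square root of 2.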














A similar proof shows that if n is a positive integer and x a positive real number
then there is a positive real number y such that y^n = x.

We next use the least upper bound property of the real numbers to establish two 
important qualitative properties of the real numbers: 


Theorem 5.7 The set R of real numbers satisfies the archimedean property: 


“Let a and b be positive real numbers. Then there is a natural number n such that 
na >b.” 


Theorem 5.8 The set Q of rational numbers satisfies the following density property:


“Let c < d be real numbers. Then there is a rational number q with c < q < d.”


Proof of Theorem 5.7: Suppose the archimedean property to be false. Then 
S = {na : n ∈ N} has b as an upper bound. Therefore S has a finite supremum β.
Since a > 0, β - a < β. So β - a is not an upper bound for S, and there must
be a natural number n* such that n* · a > β - a. But then (n* + 1) · a > β, and β
cannot be the supremum for S. This contradiction proves the theorem.














Proof of Theorem 5.8: Let λ = d - c > 0. By the archimedean property, choose
a positive integer N such that N · λ > 1. Again the archimedean property gives a
natural number P such that P > N · c and another Q such that Q > |-N · c|.
Then Q > -N · c and we see that N · c falls between the integers -Q and P;
therefore there must be an integer M between -Q and P (inclusive) such that


M - 1 ≤ N · c < M

Thus c < M/N. Also

M ≤ N · c + 1


hence


M/N ≤ c + 1/N < c + λ = d


So M/N is a rational number lying strictly between c and d.
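
The proof of Theorem 5.8 is constructive enough to turn into a short procedure. The sketch below follows its steps for inputs that Python can convert exactly to rationals (integers, `Fraction` values, or floats). The names `archimedean_n` and `rational_between` are hypothetical, and the use of `floor` to pick N and M is an implementation choice made here, not something the proof prescribes.

```python
from fractions import Fraction
from math import floor


def archimedean_n(a, b):
    """Return a natural number n with n · a > b, for positive a and b."""
    a, b = Fraction(a), Fraction(b)
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    return floor(b / a) + 1


def rational_between(c, d):
    """Return a rational M/N with c < M/N < d, following the proof's steps."""
    c, d = Fraction(c), Fraction(d)
    if not c < d:
        raise ValueError("need c < d")
    lam = d - c                    # lambda = d - c > 0
    N = archimedean_n(lam, 1)      # N · lambda > 1
    M = floor(N * c) + 1           # M - 1 <= N · c < M
    return Fraction(M, N)
```

For instance, with c = -1/2 and d = -1/4 the procedure takes N = 5 and M = -2 and returns -2/5, which does lie strictly between the two endpoints.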

One of the most profound and useful properties of the real numbers, and one that
is equivalent to the least upper bound property, is the intermediate value property:


Theorem 5.9 Let f be a continuous, real-valued function with domain the interval
[a, b]. If f(a) = α, f(b) = β, and if α < γ < β then there is a value t_0 ∈ (a, b)
such that f(t_0) = γ.


Proof: Let 


S = {x ∈ [a, b] : f(x) < γ}


Then S ≠ ∅ since a ∈ S. Moreover S is bounded above by b. So t_0 = sup S exists
as a finite real number. We claim that f(t_0) = γ.

Clearly f(t_0) ≤ γ since t_0 is the limit of numbers at which f takes values less
than γ (we use the continuity of f here). Suppose, seeking a contradiction, that
f(t_0) < γ. Let ε = γ - f(t_0). By the continuity of f, we may select δ > 0 such that
|t - t_0| < δ implies that |f(t) - f(t_0)| < ε/2. But then, for t ∈ (t_0 - δ, t_0 + δ),
f(t) < f(t_0) + ε/2 < γ. It follows that (t_0 - δ, t_0 + δ) ⊂ S, so t_0 cannot be the
supremum of S. That is a contradiction. Therefore f(t_0) = γ.














As an application, we prove the following special case of a theorem of Brouwer: 


Theorem 5.10 Let f : [0, 1] → [0, 1] be a continuous function. Then f has a fixed
point, in the sense that there is a point c ∈ [0, 1] such that f(c) = c.


Proof: Seeking a contradiction, we suppose not. Then, in particular, f(0) > 0 and
f(1) < 1. Now set g(x) = x - f(x). We see that g(0) = 0 - f(0) < 0 and g(1) =
1 - f(1) > 0. By the intermediate value property, there must therefore be a point
c between 0 and 1 such that g(c) = 0. But this says that f(c) = c, as required.
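
The sign change of g(x) = x - f(x) used in the proof also gives a practical way to locate a fixed point numerically. The sketch below uses bisection in floating point arithmetic, which is a standard technique and not the argument of the proof. The name `fixed_point` and the tolerance `tol` are hypothetical, and f is assumed to be continuous and to map [0, 1] into itself.

```python
import math


def fixed_point(f, lo=0.0, hi=1.0, tol=1e-12):
    """Approximate a c in [lo, hi] with f(c) = c by bisecting g(x) = x - f(x)."""
    def g(x):
        return x - f(x)

    if g(lo) == 0:
        return lo                  # lo is already a fixed point
    if g(hi) == 0:
        return hi                  # hi is already a fixed point
    # Otherwise g(lo) < 0 < g(hi), because f takes values in [0, 1].
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid               # the sign change lies to the right of mid
        else:
            hi = mid               # the sign change lies at or to the left of mid
    return (lo + hi) / 2
```

As a usage example, `fixed_point(math.cos)` returns a value near 0.739, since cosine maps [0, 1] into itself and has a single fixed point there.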














We conclude by recalling the “absolute value” notation. 
Definition 5.5 Let x be a real number. We define


        x     if x > 0
|x| =   0     if x = 0
        -x    if x < 0


The absolute value of a real number x measures the distance of x to 0. It is left as
an exercise for you to verify the important triangle inequality:


|x + y| ≤ |x| + |y|


5.5.1 CONSTRUCTION OF THE REAL NUMBERS


There are several techniques for constructing the real number system R from the 
rational number system Q. We use the method of Dedekind (Julius W. R. Dedekind, 
1831-1916) cuts because it uses a minimum of new ideas and is fairly brief. 

Keep in mind that, throughout this appendix, our universe is the system of rational 
numbers Q. We are constructing the new number system R. 


Definition 5.6 A cut is a subset C of Q with the following properties:


1. C ≠ ∅
2. If s ∈ C and t < s then t ∈ C
3. If s ∈ C then there is a u ∈ C such that u > s
4. There is a rational number x such that c < x for all c ∈ C


You should think of a cut $C$ as the set of all rational numbers to the left of some point
in the real line (that is, it is an open half-line of rational numbers). For example,
the set $\{x \in \mathbb{Q} : x^2 < 2\} \cup \{x \in \mathbb{Q} : x < 0\}$ is a cut. Roughly speaking, it is the set
of rational numbers to the left of $\sqrt{2}$. (Take care to note that $\sqrt{2}$ does not exist as
a rational number; so we are using a circuitous method to specify this set.) Since
we have not constructed the real line yet, we cannot define this cut in that simple
way; we have to make the construction more indirect. But if you consider the four
properties of a cut, they describe a set that looks like a “rational left half-line.”

Notice that if $C$ is a cut and $s \notin C$ then any rational $t > s$ is also not in $C$. Also,
if $r \in C$ and $s \notin C$ then it must be that $r < s$.
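
To make the example concrete, here is a small sketch that models the cut to the left of $\sqrt{2}$ as a membership test on exact rationals. The helper names `in_cut` and `larger_element` are hypothetical, and the formula $u = (2s+2)/(s+2)$ is one convenient witness for property 3 (chosen for this illustration, not taken from the text): for $s \ge 0$ with $s^2 < 2$ one has $u - s = (2 - s^2)/(s+2) > 0$ and $u^2 - 2 = (2s^2 - 4)/(s+2)^2 < 0$.

```python
from fractions import Fraction


def in_cut(q):
    """Membership test for the cut {x in Q : x^2 < 2} union {x in Q : x < 0}."""
    return q < 0 or q * q < 2


def larger_element(s):
    """Given s in the cut, return a rational u in the cut with u > s (property 3)."""
    assert in_cut(s)
    if s < 0:
        return Fraction(0)  # 0 lies in the cut and exceeds every negative s
    return (2 * s + 2) / (s + 2)


s = Fraction(7, 5)
u = larger_element(s)
print(s, u, in_cut(u))  # u = 24/17, still in the cut
```

Because every element of the cut has a strictly larger element in the cut, the cut has no largest member, which is exactly what property 3 demands.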


Definition 5.7 If $C$ and $D$ are cuts then we say that $C < D$ provided that $C$ is a
subset of $D$ but $C \ne D$.


Check for yourself that “<” is a strict, simple ordering on the set of all cuts. We
note that $C = D$ if and only if $C \subset D$ and $D \subset C$.

Now we introduce operations of addition and multiplication which will turn the 
set of all cuts into a field. 


Definition 5.8 If $C$ and $D$ are cuts then we define

$$C + D = \{c + d : c \in C, d \in D\}$$

We define the cut $\hat{0}$ to be the set of all negative rationals.


The cut $\hat{0}$ will play the role of the additive identity. We are now required to check
that field Axioms A1-A5 hold.
For A1, we need to see that $C + D$ is a cut. Obviously $C + D$ is not empty. If
$s$ is an element of $C + D$ and $t$ is a rational number less than $s$, write $s = c + d$,
where $c \in C$ and $d \in D$. Then $t - c < s - c = d \in D$, so $t - c \in D$; and $c \in C$.
Hence $t = c + (t - c) \in C + D$. A similar argument shows that there is an $r > s$
such that $r \in C + D$. Finally, if $x$ is a rational upper bound for $C$ and $y$ is a rational
upper bound for $D$, then $x + y$ is a rational upper bound for $C + D$. We conclude
that $C + D$ is a cut.

Since addition of rational numbers is commutative, it follows immediately that 
addition of cuts is commutative. Associativity follows in a similar fashion. That 
takes care of A2 and A3. 

Now we show that if $C$ is a cut then $C + \hat{0} = C$. For if $c \in C$ and $z \in \hat{0}$ then
$c + z < c + 0 = c$, hence $C + \hat{0} \subset C$. Also, if $c^* \in C$ then choose a $d^* \in C$ such
that $c^* < d^*$. Then $c^* - d^* < 0$ so $c^* - d^* \in \hat{0}$. And $c^* = d^* + (c^* - d^*)$. Hence
$C \subset C + \hat{0}$. We conclude that $C + \hat{0} = C$. This is A4.

Finally, for Axiom A5, we let $C$ be a cut and set $-C$ to be equal to $\{d \in \mathbb{Q} :
\exists\, d^* > d \text{ such that } c + d^* < 0 \text{ for all } c \in C\}$. If $x$ is a rational upper bound for $C$
then $-x - 1 \in -C$ (take $d^* = -x$), so $-C$ is not empty. It is also routine to check that $-C$ is a cut. By
its very definition, $C + (-C) \subset \hat{0}$.

Further, if $z \in \hat{0}$ then there is a $z^* \in \hat{0}$ such that $z < z^*$. Choose an element
$c \in C$ such that $c + (z^* - z) \notin C$ (why is this possible?). Let $c^* \in C$ be such that
$c < c^*$. Set $c^{**} = z - c^*$. Then $d^* = z - c > c^{**}$. We claim that $\tilde{c} + d^* < 0$ for all
$\tilde{c} \in C$. Suppose for the moment that this claim has been proved. Then this shows that
$c^{**} \in -C$. Then $z = c^* + c^{**} \in C + (-C)$ so that $\hat{0} \subset C + (-C)$. We then conclude
that $C + (-C) = \hat{0}$, and Axiom A5 is established.

It remains to prove the claim. So let $d^*$ be defined as above and select $\tilde{c} \in C$.
Then

$$d^* + \tilde{c} = z + (-c + \tilde{c}) < z + (z^* - z) = z^* < 0$$

Here we have used the choice of $c$. This establishes the claim and completes the
proof of A5.
Having verified the axioms for addition, we turn now to multiplication. 


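Definition 5.9 If $C$ and $D$ are cuts then we define the product $C \cdot D$ as follows:

• If $C, D > \hat{0}$ then $C \cdot D = \{q \in \mathbb{Q} : q < c \cdot d \text{ for some } c \in C, d \in D \text{ with } c > 0, d > 0\}$
• If $C > \hat{0}$, $D < \hat{0}$ then $C \cdot D = -[C \cdot (-D)]$
• If $C < \hat{0}$, $D > \hat{0}$ then $C \cdot D = -[(-C) \cdot D]$
• If $C, D < \hat{0}$ then $C \cdot D = (-C) \cdot (-D)$
• If either $C = \hat{0}$ or $D = \hat{0}$ then $C \cdot D = \hat{0}$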
Notice that, for convenience, we have defined multiplication of negative numbers
just as we did in high school. The reason is that the definition that we use for 
the product of two positive numbers cannot work when one of the two factors is 
negative (check this as an exercise). 

We have said what the additive identity is in this realization of the real numbers. 
Of course the multiplicative identity is the cut corresponding to 1, or

$$\hat{1} = \{t \in \mathbb{Q} : t < 1\}$$

We leave it to the reader to verify that if $C$ is any cut then $\hat{1} \cdot C = C \cdot \hat{1} = C$.

It is now routine to verify that the set of all cuts, with this definition of multiplication,
satisfies field Axioms M1-M5. The proofs follow those for A1-A5 rather
closely.

For the distributive property, one first checks the case when all the cuts are 
positive, reducing it to the distributive property for the rationals. Then one handles 
negative cuts on a case-by-case basis. 

The two properties of an ordered field are also easily checked for the set of all 
cuts. 

We now know that the collection of all cuts forms an ordered field. Denote this 
field by the symbol R and call it the real number system. We next verify the crucial 
property of R that sets it apart from Q. 


Theorem 5.11 The ordered field R satisfies the least upper bound property. 


Proof: Let $S$ be a nonempty subset of $\mathbb{R}$ which is bounded above. That is, there is a cut $\alpha$
such that $s \le \alpha$ for all $s \in S$. Define

$$S^* = \bigcup_{C \in S} C$$

Then $S^*$ is clearly nonempty, and it is therefore a cut since it is a union of cuts and is contained in the cut $\alpha$. It
is also clearly an upper bound for $S$ since it contains each element of $S$. It remains
to check that $S^*$ is the least upper bound for $S$.

In fact if $T < S^*$ then $T \subset S^*$ and there is a rational number $q$ in $S^* \setminus T$. But,
by the definition of $S^*$, it must be that $q \in C$ for some $C \in S$. So $C > T$, and $T$
cannot be an upper bound for $S$. Therefore $S^*$ is the least upper bound for $S$, as
desired.














We have shown that $\mathbb{R}$ is an ordered field which satisfies the least upper bound
property. It remains to show that $\mathbb{R}$ contains (a copy of) $\mathbb{Q}$ in a natural way. In fact,
if $q \in \mathbb{Q}$ we associate to it the element $\varphi(q) = C_q = \{x \in \mathbb{Q} : x < q\}$. Then $C_q$ is
obviously a cut. It is also routine to check that

$$\varphi(q + q^*) = \varphi(q) + \varphi(q^*) \quad \text{and} \quad \varphi(q \cdot q^*) = \varphi(q) \cdot \varphi(q^*)$$

Therefore we see that $\varphi$ is a ring homomorphism (see [LAN]) and hence represents
$\mathbb{Q}$ as a “subfield” of $\mathbb{R}$.


5.6 The Nonstandard Real Number System 
5.6.1 THE NEED FOR NONSTANDARD NUMBERS 


Isaac Newton’s calculus was premised on the existence of certain “infinitesimal
numbers”: numbers that are positive, smaller than any positive standard real number, but
not zero. Since limits were not understood in Newton’s time, infinitesimals served 
in their stead. But in fact it was just these infinitesimals that called the theory of 
calculus into doubt. More than a century was expended developing the theory of 
limits in order to dispel those doubts. 

Nonstandard analysis, due to Abraham Robinson (1918—1974), is a model for 
the real numbers (that is, it is a number system that satisfies the axioms for the real 
numbers that we enunciated in Sec. 5.5) that also contains infinitesimals. In a sense, 
then, Robinson’s nonstandard reals are a perfectly rigorous theory that vindicates 
Newton’s original ideas about infinitesimally small numbers. 


5.6.2 FILTERS AND ULTRAFILTERS 


One of the most standard constructions of the nonstandard real numbers involves 
putting an equivalence relation on the set of all sequences $\{a_j\}$ of real numbers. A
natural algebraic construction for doing so is the ultrafilter. In fact ultrafilters are
widely used in model theory (see the article by P. C. Eklof in [BAR]). So we will 
briefly say now what an ultrafilter is. 

Let $I$ be a nonempty set. A filter over $I$ is a set $D \subset P(I)$ such that

1. $\emptyset \notin D$, $I \in D$;
2. If $X, Y \in D$ then $X \cap Y \in D$;
3. If $X \in D$ and $X \subset Y \subset I$ then $Y \in D$.

In particular, a filter $D$ over $I$ has the finite intersection property: the intersection
of any finite set of elements of $D$ is nonempty.

A filter $D$ over $I$ is called an ultrafilter if, for every $X \subset I$, either $X \in D$ or
$I \setminus X \in D$. It turns out that a filter over $I$ is an ultrafilter if and only if it is a
maximal filter over $I$ (that is, there is no larger filter containing it). One can show,
using Zorn’s lemma, that if $S$ is a collection of subsets of $I$ which has the finite
intersection property then $S$ is contained in an ultrafilter over $I$.
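
On a small finite set the three filter conditions and the ultrafilter condition can be checked by brute force. The sketch below is only an illustration under that assumption; the names `powerset`, `is_filter`, and `is_ultrafilter` are hypothetical. It tests the family of all subsets of $\{0, 1, 2\}$ that contain the point 0. Note that on a finite set every ultrafilter has this "contains a fixed point" form, so this toy case does not exhibit the kind of ultrafilter needed for the nonstandard reals.

```python
from itertools import combinations


def powerset(I):
    """All subsets of the finite set I, as frozensets."""
    items = sorted(I)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(D, I):
    """Check conditions 1-3 of the definition for a family D of subsets of I."""
    I = frozenset(I)
    if frozenset() in D or I not in D:
        return False
    if any((X & Y) not in D for X in D for Y in D):
        return False
    # Upward closure: every superset (within I) of a member is a member.
    return all(Y in D for X in D for Y in powerset(I) if X <= Y)


def is_ultrafilter(D, I):
    """A filter in which every subset or its complement is a member."""
    I = frozenset(I)
    return is_filter(D, I) and all(X in D or (I - X) in D for X in powerset(I))


I = {0, 1, 2}
D = {X for X in powerset(I) if 0 in X}   # all subsets containing the point 0
trivial = {frozenset(I)}                 # the filter consisting of I alone
print(is_ultrafilter(D, I), is_filter(trivial, I), is_ultrafilter(trivial, I))
```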


5.6.3 A USEFUL MEASURE 


We will follow the exposition that may be found at 

http://members.tripod.com/PhilipApps/howto.html
See also [LIN], [CUT]. At the end, we will point out the ultrafilter that is lurking 
in the background. 


Let $m$ be a finitely additive measure on the set $\mathbb{N}$ of natural numbers such that

1. For any subset $A \subset \mathbb{N}$, $m(A)$ is either 0 or 1.
2. It holds that $m(\mathbb{N}) = 1$ and $m(B) = 0$ for any finite set $B$.

That such a measure $m$ exists is an easy exercise with the Axiom of Choice.¹ We
leave the details to the interested reader.


5.6.4 AN EQUIVALENCE RELATION 
Let

$$S = \{\{a_n\}_{n=1}^{\infty} : a_n \in \mathbb{R} \text{ for all } n = 1, 2, \dots\}$$

Define a relation $\sim$ on $S$ by

$$\{a_n\} \sim \{b_n\} \text{ if and only if } m\{n : a_n = b_n\} = 1$$

Then $\sim$ is clearly an equivalence relation. We let $\mathbb{R}^* = S/\sim$ be the nonstandard
real number system.² In other words $\mathbb{R}^*$ is the collection of equivalence classes
induced by this equivalence relation.





¹The rather innocent-sounding Axiom of Choice says that, given any collection of nonempty sets, there is a function
that assigns to each set one of its elements. This axiom was first formulated by Ernst Zermelo (1871-1953) in
1904. It turns out to harbor many mysteries, and has had a profound influence on the development of modern
mathematics.


²In fact this is the point where we use an ultrafilter. The set $M = \{A \subset \mathbb{N} : m(A) = 1\}$ is an ultrafilter. We are
modding out by this ultrafilter.
We let $[\{a_n\}]$ denote the equivalence class containing the sequence $\{a_n\}$. Then
we define some of the elementary operations on $\mathbb{R}^*$ by

$$[\{a_n\}] + [\{b_n\}] = [\{a_n + b_n\}]$$
$$[\{a_n\}] \cdot [\{b_n\}] = [\{a_n \cdot b_n\}]$$
$$[\{a_n\}] < [\{b_n\}] \text{ iff } m\{n : a_n < b_n\} = 1$$


Further, we identify a standard real number b with the equivalence class 
[{b, b, b,...}]. 


5.6.5 AN EXTENSION OF THE REAL NUMBER SYSTEM 


We have seen that $\mathbb{R}^*$ clearly contains $\mathbb{R}$ in a natural way. And it contains other
elements too. We call $x \in \mathbb{R}^*$ an infinitesimal if and only if $x \ne 0$ and $-a < x < a$
for every positive real number $a$. For example, $[\{1, 1/2, 1/3, \dots\}]$ is an infinitesimal.
We call $y \in \mathbb{R}^*$ an infinitary number if $y > b$ for every real number $b$ or $y < d$ for
every real number $d$. As an instance, $[\{1, 2, 3, \dots\}]$ is an infinitary number.
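
As a worked check (a sketch that uses only the two listed properties of $m$, finite additivity, and the identification of a real number with its constant sequence), here is why the class $x$ of the sequence $a_n = 1/n$ qualifies as an infinitesimal. Fix a positive real number $a$.

```latex
\begin{align*}
m\{n : 1/n = 0\} &= m(\emptyset) = 0
  && \text{so } x \neq 0, \\
\{n : 1/n \geq a\} &\text{ is a finite set}
  && \text{so } m\{n : 1/n \geq a\} = 0, \\
m\{n : 1/n < a\} &= m(\mathbb{N}) - m\{n : 1/n \geq a\} = 1 - 0 = 1
  && \text{so } x < a, \\
m\{n : -a < 1/n\} &= m(\mathbb{N}) = 1
  && \text{so } -a < x.
\end{align*}
```

The same reasoning, with the finite set $\{n : n \leq b\}$ in place of $\{n : 1/n \geq a\}$, shows that the class of $\{1, 2, 3, \dots\}$ exceeds every real number $b$.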

It would be inappropriate in a book of this type to delve very far into the theory of 
the nonstandard reals. But at least now the reader has an idea of what the nonstandard 
real numbers are, and of how a number system could contain both the standard reals 
and also infinitesimals and infinitaries. 


5.7 The Complex Numbers 


When we first learn about the complex numbers, the most troublesome point is the 
very beginning: “Let’s pretend that the number $-1$ has a square root. Call it $i$.”
What gives us the right to “pretend” in this fashion? The answer is that we have no 
such right. If —1 has a square root, we should be able to construct a number system 
in which that is the case. That is what we shall do in this section. 


Definition 5.10 The system of complex numbers, denoted by the symbol $\mathbb{C}$, consists
of all ordered pairs $(a, b)$ of real numbers (in other words, $\mathbb{C} = \mathbb{R} \times \mathbb{R}$). We add
two complex numbers $(a, b)$ and $(a^*, b^*)$ by the formula

$$(a, b) + (a^*, b^*) = (a + a^*, b + b^*)$$

We multiply two complex numbers by the formula

$$(a, b) \cdot (a^*, b^*) = (a \cdot a^* - b \cdot b^*, a \cdot b^* + a^* \cdot b)$$
Remark 5.9 If you are puzzled by this definition of multiplication, then do not
worry. In a few moments you will see that it gives rise to the notion of multiplication
of complex numbers that you have seen before.

It is interesting to note that, unlike the integers and the rational numbers, the 
new number system C is not a collection of equivalence classes. Instead, C is the 
Euclidean plane equipped with some new algebraic operations. 


EXAMPLE 5.13 
Let z = (3, —2) and w = (4, 7) be two complex numbers. Then 


$$z + w = (3, -2) + (4, 7) = (3 + 4, -2 + 7) = (7, 5).$$

Also

$$z \cdot w = (3, -2) \cdot (4, 7) = (3 \cdot 4 - (-2) \cdot 7, 3 \cdot 7 + 4 \cdot (-2)) = (26, 13)$$
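
The two formulas of Definition 5.10 translate directly into code. The following is a minimal sketch, with the illustrative names `c_add` and `c_mul`, that treats a complex number as a plain pair of numbers and reproduces the calculations of Example 5.13. It also compares the result against Python's built-in `complex` type as an independent check.

```python
def c_add(z, w):
    """(a, b) + (a*, b*) = (a + a*, b + b*)."""
    return (z[0] + w[0], z[1] + w[1])


def c_mul(z, w):
    """(a, b) . (a*, b*) = (a a* - b b*, a b* + a* b)."""
    a, b = z
    a_star, b_star = w
    return (a * a_star - b * b_star, a * b_star + a_star * b)


z, w = (3, -2), (4, 7)
i = (0, 1)

print(c_add(z, w))   # the sum from Example 5.13
print(c_mul(z, w))   # the product from Example 5.13
print(c_mul(i, i))   # the pair standing for -1
print(complex(*c_mul(z, w)) == complex(*z) * complex(*w))
```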





As usual, we ought to check that addition and multiplication are commutative, 
associative, that multiplication distributes over addition, and so forth. We shall leave 
these tasks as an exercise for the reader. Instead we develop some of the crucial 
properties of our new number system. 


Theorem 5.12 The following properties hold for the number system C. 


(1) The number $1 = (1, 0)$ is the multiplicative identity: $1 \cdot z = z$ for any $z \in \mathbb{C}$.

(2) The number $0 = (0, 0)$ is the additive identity: $0 + z = z$ for any $z \in \mathbb{C}$.

(3) Each complex number $z = (x, y)$ has an additive inverse $-z = (-x, -y)$:
it holds that $z + (-z) = 0$.

(4) The number $i = (0, 1)$ satisfies $i \cdot i = (-1, 0) = -1$; in other words, $i$ is a
square root of $-1$.


Proof: These are direct calculations, but it is important for us to work out these 
facts. 
First, let z = (x, y) be any complex number. Then 


$$1 \cdot z = (1, 0) \cdot (x, y) = (1 \cdot x - 0 \cdot y, 1 \cdot y + x \cdot 0) = (x, y) = z$$

This proves the first assertion.
For the second, we have

$$0 + z = (0, 0) + (x, y) = (0 + x, 0 + y) = (x, y) = z$$

With $z$ as above, set $-z = (-x, -y)$. Then

$$z + (-z) = (x, y) + (-x, -y) = (x + (-x), y + (-y)) = (0, 0) = 0$$

Finally, we calculate

$$i \cdot i = (0, 1) \cdot (0, 1) = (0 \cdot 0 - 1 \cdot 1, 0 \cdot 1 + 0 \cdot 1) = (-1, 0) = -1$$

Thus, as asserted, $i$ is a square root of $-1$.





Proposition 5.4 If $z \in \mathbb{C}$, $z \ne 0$, then there is a complex number $w$ such that
$z \cdot w = 1$.


Proof: Write $z = (x, y)$ and set

$$w = \left(\frac{x}{x^2 + y^2}, \frac{-y}{x^2 + y^2}\right)$$

Since $z \ne 0$, this definition makes sense. Then it is straightforward to verify that
$z \cdot w = 1$.
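
For readers who want to see the verification written out, the following worked line applies the multiplication formula of Definition 5.10 with $(a, b) = (x, y)$ and $(a^*, b^*) = \left(\frac{x}{x^2+y^2}, \frac{-y}{x^2+y^2}\right)$.

```latex
\begin{align*}
z \cdot w
  &= \left( x \cdot \frac{x}{x^2 + y^2} - y \cdot \frac{-y}{x^2 + y^2},\;
            x \cdot \frac{-y}{x^2 + y^2} + \frac{x}{x^2 + y^2} \cdot y \right) \\
  &= \left( \frac{x^2 + y^2}{x^2 + y^2},\; \frac{-xy + xy}{x^2 + y^2} \right)
   = (1, 0) = 1.
\end{align*}
```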

















Thus every nonzero complex number has a multiplicative inverse. The other field 
axioms for C are easy to check. We conclude that the number system C forms a 
field. You will prove in the exercises that it is not possible to order this field. If
$\alpha$ is a real number then we associate $\alpha$ with the complex number $(\alpha, 0)$. In this
way, we can think of the real numbers as a subset of the complex numbers. In fact,
the real field $\mathbb{R}$ is a subfield of the complex field $\mathbb{C}$. This means that if $\alpha, \beta \in \mathbb{R}$
and $(\alpha, 0), (\beta, 0)$ are the corresponding elements in $\mathbb{C}$ then $\alpha + \beta$ corresponds to
$(\alpha + \beta, 0)$ and $\alpha \cdot \beta$ corresponds to $(\alpha, 0) \cdot (\beta, 0)$. These assertions are explored
more thoroughly in the exercises.

With the remarks in the preceding paragraph we can sometimes ignore the dis- 
tinction between the real numbers and the complex numbers. For example, we can 
write 


$$5 \cdot i$$

and understand that it means $(5, 0) \cdot (0, 1) = (0, 5)$. Likewise, the expression

$$5 \cdot 1$$

can be interpreted as $5 \cdot 1 = 5$ or as $(5, 0) \cdot (1, 0) = (5, 0)$ without any danger of
ambiguity or misunderstanding. 
Theorem 5.13 Every complex number can be written in the form $a + b \cdot i$, where
$a$ and $b$ are real numbers. In fact, if $z = (x, y) \in \mathbb{C}$ then

$$z = x + y \cdot i$$

Proof: With the identification of real numbers as a subfield of the complex numbers,
we have that

$$x + y \cdot i = (x, 0) + (y, 0) \cdot (0, 1) = (x, 0) + (0, y) = (x, y) = z$$











as claimed. 





Now that we have constructed the complex number field, we will adhere to the
usual custom of writing complex numbers as $z = a + b \cdot i$ or, more simply, $a + bi$.
We call $a$ the real part of $z$, denoted by $\mathrm{Re}\, z$, and $b$ the imaginary part of $z$, denoted
$\mathrm{Im}\, z$. In this notation, our algebraic operations become

$$(a + bi) + (a^* + b^* i) = (a + a^*) + (b + b^*) i$$

and

$$(a + bi) \cdot (a^* + b^* i) = (a \cdot a^* - b \cdot b^*) + (a \cdot b^* + a^* \cdot b) i$$

If $z = a + bi$ is a complex number then we define its complex conjugate to be the
number $\bar{z} = a - bi$. We record some elementary facts about the complex conjugate:


Proposition 5.5 If z, w are complex numbers then 


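(1) $\overline{z + w} = \bar{z} + \bar{w}$;

(2) $\overline{z \cdot w} = \bar{z} \cdot \bar{w}$;

(3) $z + \bar{z} = 2 \cdot \mathrm{Re}\, z$;

(4) $z - \bar{z} = 2 \cdot i \cdot \mathrm{Im}\, z$;

(5) $z \cdot \bar{z} \ge 0$, with equality holding if and only if $z = 0$.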


Proof: Write $z = a + bi$, $w = c + di$. Then

$$\begin{aligned}
\overline{z + w} &= \overline{(a + c) + (b + d) i} \\
&= (a + c) - (b + d) i \\
&= (a - bi) + (c - di) \\
&= \bar{z} + \bar{w}
\end{aligned}$$

This proves (1). Assertions (2), (3), (4) are proved similarly. For (5), notice that

$$z \cdot \bar{z} = (a + bi) \cdot (a - bi) = a^2 + b^2 \ge 0$$

Clearly equality holds if and only if $a = b = 0$.
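
As an illustration of what "proved similarly" involves, here is assertion (2) written out with the same notation $z = a + bi$, $w = c + di$. Both sides are expanded with the multiplication rule and then compared.

```latex
\begin{align*}
\overline{z \cdot w}
  &= \overline{(ac - bd) + (ad + bc) i}
   = (ac - bd) - (ad + bc) i, \\
\bar{z} \cdot \bar{w}
  &= (a - bi) \cdot (c - di)
   = (ac - bd) + (-ad - bc) i
   = (ac - bd) - (ad + bc) i.
\end{align*}
```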





The expression $|z|$ is defined to be the nonnegative square root of $z \cdot \bar{z}$. In other
words, writing $z = x + iy$,

$$|z| = \sqrt{z \cdot \bar{z}} = \sqrt{(x + iy) \cdot (x - iy)} = \sqrt{x^2 + y^2}$$


It is called the modulus of z and plays the same role for the complex field that 
absolute value plays for the real field: the modulus of z measures the distance of z 
to the origin. 

The modulus has the following properties. 


Proposition 5.6 If $z, w \in \mathbb{C}$ then

1. $|z| = |\bar{z}|$;
2. $|z \cdot w| = |z| \cdot |w|$;
3. $|\mathrm{Re}\, z| \le |z|$, $|\mathrm{Im}\, z| \le |z|$;
4. $|z + w| \le |z| + |w|$.


Proof: Write z = a + bi, w = c + di. Then (1), (2), and (3) are immediate. For 
(4) we calculate that 
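$$\begin{aligned}
|z + w|^2 &= (z + w) \cdot \overline{(z + w)} \\
&= z \cdot \bar{z} + z \cdot \bar{w} + w \cdot \bar{z} + w \cdot \bar{w} \\
&= |z|^2 + 2\,\mathrm{Re}(z \cdot \bar{w}) + |w|^2 \\
&\le |z|^2 + 2 |z \cdot \bar{w}| + |w|^2 \\
&= |z|^2 + 2 |z| \cdot |w| + |w|^2 \\
&= (|z| + |w|)^2
\end{aligned}$$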











Taking square roots proves (4). 





Observe that if $z$ is real then $z = a + 0i$ and the modulus of $z$ equals the absolute
value of $a$. Likewise, if $z = 0 + bi$ is pure imaginary then the modulus of $z$ equals
the absolute value of $b$. In particular, the fourth part of Proposition 5.6 reduces,
in the real case, to the triangle inequality

$$|x + y| \le |x| + |y|$$


5.8 The Quaternions, the Cayley Numbers, 
and Beyond 


Now we shall discuss a number system that you may have never encountered before.
It is called the system of quaternions. Our description will be an informal one.

Imagine $\mathbb{R}^4 = \mathbb{R} \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}$ equipped with the following operations: set $i =
(0, 1, 0, 0)$, $j = (0, 0, 1, 0)$, $k = (0, 0, 0, 1)$. Denote the 4-tuple $(1, 0, 0, 0)$ by 1.
Define the multiplication laws

$$i \cdot i = -1 \qquad j \cdot j = -1 \qquad k \cdot k = -1$$

and

$$i \cdot j = k \qquad j \cdot k = i \qquad k \cdot i = j$$

and

$$j \cdot i = -k \qquad k \cdot j = -i \qquad i \cdot k = -j$$

Of course the element 1 multiplied times any 4-tuple $z$ is declared to be equal to $z$.
In particular, $1 \cdot 1 = 1$.

Finally, if $z = (z_1, z_2, z_3, z_4)$ and $w = (w_1, w_2, w_3, w_4)$ are 4-tuples then we
write

$$z = z_1 \cdot 1 + z_2 i + z_3 j + z_4 k$$

and

$$w = w_1 \cdot 1 + w_2 i + w_3 j + w_4 k$$

Then $z \cdot w$ is defined by using the (obvious) distributive law and the rules already
specified. For example,

$$\begin{aligned}
(2, 0, 1, 3) \cdot (-4, 1, 0, 1) &= [2 \cdot 1 + j + 3k] \cdot [-4 \cdot 1 + i + k] \\
&= (2 \cdot (-4)) \cdot 1 + (2i) + (2k) \\
&\quad + (j \cdot (-4)) + (j \cdot i) + (j \cdot k) \\
&\quad + (3k \cdot (-4)) + (3k \cdot i) + (3k \cdot k) \\
&= -8 \cdot 1 + 2i + 2k - 4j - k + i - 12k + 3j - 3 \cdot 1 \\
&= -11 \cdot 1 + 3i - j - 11k \\
&= (-11, 3, -1, -11)
\end{aligned}$$
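
The "distributive law plus the rules already specified" can be carried out mechanically. The sketch below is one possible encoding, not a standard library routine: the table name `BASIS` and the function `q_mul` are hypothetical, and 4-tuples are indexed so that positions 0, 1, 2, 3 stand for $1, i, j, k$. It reproduces the worked product above and shows that reversing the factors changes the answer.

```python
# BASIS[(p, q)] = (sign, r) encodes e_p * e_q = sign * e_r,
# where e_0 = 1, e_1 = i, e_2 = j, e_3 = k.
BASIS = {}
for p in range(4):
    BASIS[(0, p)] = (1, p)   # 1 * e_p = e_p
    BASIS[(p, 0)] = (1, p)   # e_p * 1 = e_p
for p in (1, 2, 3):
    BASIS[(p, p)] = (-1, 0)  # i*i = j*j = k*k = -1
for p, q, r in ((1, 2, 3), (2, 3, 1), (3, 1, 2)):
    BASIS[(p, q)] = (1, r)   # i*j = k, j*k = i, k*i = j
    BASIS[(q, p)] = (-1, r)  # j*i = -k, k*j = -i, i*k = -j


def q_mul(z, w):
    """Multiply two quaternions given as 4-tuples, term by term."""
    out = [0, 0, 0, 0]
    for p, zp in enumerate(z):
        for q, wq in enumerate(w):
            sign, r = BASIS[(p, q)]
            out[r] += sign * zp * wq
    return tuple(out)


z = (2, 0, 1, 3)
w = (-4, 1, 0, 1)
print(q_mul(z, w))   # the product computed in the text
print(q_mul(w, z))   # the factors in the other order
```

Comparing the two printed tuples gives a concrete instance of the noncommutativity of quaternion multiplication.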


Addition of two quaternions is simply performed componentwise: if $z =
(z_1, z_2, z_3, z_4)$ and $w = (w_1, w_2, w_3, w_4)$ then

$$z + w = (z_1 + w_1, z_2 + w_2, z_3 + w_3, z_4 + w_4)$$


Verify for yourself that the additive identity in the quaternions is (0, 0, 0, 0). The 
multiplicative identity is 1 = (1, 0, 0, 0). 

In fact it can be checked that each nonzero element of the quaternions has 
a unique two-sided multiplicative inverse. However, since multiplication is not 
commutative, the quaternions do not form a field; instead the algebraic structure 
is called a division ring. 

It is also possible to give $\mathbb{R}^8$ an additive and a multiplicative structure. The multiplication
operation is both noncommutative and nonassociative. The resulting eight-dimensional
algebraic object is called the Cayley numbers. We shall not present the
details here. It is one of the great theorems of twentieth century mathematics (see
[ADA], [BOM]) that $\mathbb{R}^1$, $\mathbb{R}^2$, $\mathbb{R}^4$, and $\mathbb{R}^8$ are the only Euclidean spaces that can be
equipped with compatible addition and multiplication operations in a natural way
(so that the algebraic operations are smooth functions of the coordinates).

The quaternions and Cayley numbers are used in mathematical physics, in the
representation theory of groups, and in algebraic topology. The Cayley numbers
are included in the ROM of every cell phone as part of the system of encoding
messages.


Exercises 


1. Let $S$ be a set and let $p : S \times S \to S$ be a binary operation. If $T \subset S$ then
we say that $T$ is closed under $p$ if $p : T \times T \to T$. (As an example, let
$S = \mathbb{Z}$ and $T$ be the even integers and $p$ be ordinary addition.) Under which
arithmetic operations $+, -, \cdot, \div$ is the set $\mathbb{Q}$ closed? Under which arithmetic
operations $+, -, \cdot, \div$ is the set $\mathbb{R} \setminus \mathbb{Q}$ closed?

2. Let $q$ be a rational number. Construct a sequence $\{x_j\}$ of irrational numbers
such that $x_j \to q$. This means that, for each $\epsilon > 0$, there is a positive integer
$K$ such that if $j > K$ then $|x_j - q| < \epsilon$.

3. Let $S$ be a set of real numbers with the property that whenever $x, y \in S$ and
$x < t < y$ then $t \in S$. Can you give a simple description of the set $S$?

4. Explain why every nonzero complex number $z \in \mathbb{C}$ has two distinct square
roots in $\mathbb{C}$.

5. An Argand diagram is a device for sketching a complex number in the plane.
If $x + iy$ is a complex number then we depict it in the cartesian plane as the
point $(x, y)$. Sketch the complex numbers $3 - 2i$, $4i + 7$, $\pi i + e$, $-6 - i$.

6. The complex number $1 = 1 + 0i$ has three cube roots. Use any means to find
them, and sketch them on an Argand diagram.

7. Let $p$ be a polynomial and assume that $\alpha \in \mathbb{C}$ is a root of $p$. Prove that $(z - \alpha)$
evenly divides $p(z)$ with no remainder.

8. Determine whether $\sqrt{2} + \sqrt{3}$ is rational or irrational.

9. Find all square roots in the quaternions of the number $1 + i + j$.

10. Explain why subtraction in the integers is well defined.

11. Explain why multiplication in the integers is well defined.

12. Explain why division makes no sense in the integers.

13. Prove that addition in the integers is commutative.

14. Show that addition of rational numbers is well defined.

15. Show that the complex numbers cannot be ordered as a field.

16. Follow the outline in the text to show that the real numbers form a subfield
of the complex numbers.
CHAPTER 6





Counting Arguments 


Although everybody knows how to count in the sense of 
1, 2, 3, 4, 5, ...


the fact is that counting is one of the most important techniques of modern mathe- 
matics. And counting can be quite sophisticated. Imagine counting the number of 
ways that one can get a straight flush in a hand of seven-card-stud poker. This is by 
no means a trivial counting problem. 

In the present chapter we shall learn some important counting techniques that 
can be used to attack a variety of problems. 


6.1 The Pigeonhole Principle 


Also known as the Dirichletscher Schubfachschluss (“Dirichlet’s drawer-shutting 
principle”), this is one of the key ideas in all of counting theory. And the idea is 
simplicity itself: 


Pigeonhole principle: Let $k$ be a positive integer. Imagine that you are delivering
$k + 1$ letters to $k$ mailboxes. Then it must be that some mailbox will
receive at least two letters.
Obvious? Well, if each mailbox only received 0 or 1 letter then the total number
of letters could not be more than

$$\underbrace{1 + 1 + \cdots + 1}_{k \text{ times}} = k$$

So there could not be more than $k$ letters, and that is a contradiction (since we assumed
that we had $k + 1$ letters). There are many other ways to verify this significant
principle of counting.
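
For small values of $k$ the principle can also be confirmed by exhaustive search. The sketch below (the function name `pigeonhole_holds` is illustrative) lists every possible way of delivering $k + 1$ letters to $k$ mailboxes and checks that each delivery puts at least two letters in some mailbox. This is a sanity check for a few small cases, not a substitute for the argument above.

```python
from collections import Counter
from itertools import product


def pigeonhole_holds(k):
    """Check every assignment of k + 1 letters to k mailboxes."""
    for assignment in product(range(k), repeat=k + 1):
        # assignment[j] is the mailbox that receives letter j
        if max(Counter(assignment).values()) < 2:
            return False   # would be a counterexample
    return True


print(all(pigeonhole_holds(k) for k in range(1, 6)))
```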

Let us now illustrate the pigeonhole principle with some incisive examples. 


EXAMPLE 6.1 

Joe needs to get up at 4:00 a.m. each morning to go to work. He needs to get a pair 
of matching socks from his drawer without turning on the light and disturbing his 
wife. He knows that there are socks of three different colors, unpaired and randomly 
distributed, in the drawer. How many socks should he grab so that he can be sure 
to have a pair of the same color? 


Solution: Call the sock colors red, green, and yellow. If Joe grabs three socks 
then one could be red, one could be green, and one could be yellow. So that will 
not do. 

If he grabs four socks, then imagine that each sock is a “letter” and that there 
are three mailboxes labeled red, green, and yellow. He sticks each sock into the 
mailbox corresponding to its color. Well, there are three mailboxes and four letters, 
so some mailbox must end up with two letters. That means that two of the socks 
have the same color. 

The answer is that Joe should grab four socks. 














EXAMPLE 6.2 
There are 50 people in a room. Let us verify that there are two people in the room 
who have the same number of acquaintances in the room. 


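Solution: If two or more people have no acquaintances, then those people share the same count (namely 0) and we are done. Otherwise we consider two cases: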


Case 1: There is one and only one person named Mary who has no acquaintances. 
Thus everyone else has 1 or 2 or ... or 48 acquaintances (nobody could have 49 
acquaintances because then he/she would be acquainted with Mary—impossible!). 
Thus we have 49 people (these are the “letters”), each with somewhere between 
1 and 48 acquaintances (these are the “mailboxes”). Thus some mailbox has two 
letters, meaning that two different people have the same number of acquaintances. 
Figure 6.1 The dartboard divided into six regions.


Case 2: In this case we assume that there is no Mary. So every one of the 50 people 
has 1 or more acquaintances. Thus everyone has either 1 or 2 or ... or 49 acquain- 
tances. That gives 50 letters going into 49 mailboxes. Some mailbox must contain 
two letters, meaning that two people have the same number of acquaintances. 
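
The claim of Example 6.2 can be tried out on randomly generated rooms. The following sketch assumes, as the solution implicitly does, that acquaintance is mutual; the helper names and the random model (each pair is acquainted independently with probability `p`) are assumptions made only for this illustration.

```python
import random
from itertools import combinations


def random_acquaintance_counts(n, p, rng):
    """Acquaintance counts for n people; each pair is acquainted with probability p."""
    counts = [0] * n
    for a, b in combinations(range(n), 2):
        if rng.random() < p:
            counts[a] += 1   # acquaintance is mutual,
            counts[b] += 1   # so both counts go up
    return counts


def has_repeated_count(counts):
    """True when two people have the same number of acquaintances."""
    return len(set(counts)) < len(counts)


rng = random.Random(0)
trials = [random_acquaintance_counts(50, rng.random(), rng) for _ in range(100)]
print(all(has_repeated_count(c) for c in trials))
```

No choice of seed or probability can produce a room without a repeated count, since that is exactly what the pigeonhole argument rules out.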














EXAMPLE 6.3 

Suppose that a standard dartboard has radius 10 inches. We throw seven darts at 
the dartboard. Why is it true that two of the darts will be at most 10 inches
apart?


Solution: Examine Fig. 6.1. It shows the dartboard divided into six regions. But 
there are seven darts. By the pigeonhole principle, two of the darts must land in the 
same region. Refer to Fig. 6.2. Since the dartboard has radius 10 inches, no two 
points of the region are more distant than 10 inches. So these two darts satisfy our 
criterion. 














EXAMPLE 6.4 
Verify that, at any given moment in New York City, there are two people with the 
same number of hairs on their heads. 


Solution: Any person has at most 900 hairs per square inch on his/her dome. No
head has diameter greater than 20 inches. It is easy to estimate that the cranium
has surface area at most $4\pi r^2$ (that is, the surface area of a sphere of radius $r$)
hence at most 5000 square inches. So nobody has more than 5 million hairs on his/her
head.
Figure 6.2 Two darts in the same region.


But New York City has about 9 million people. Thus we have 9 million letters 
with at most 5 million mailboxes. It follows that some mailbox will contain two 
letters, so that two people will have the same number of hairs. 














You Try It: Select 55 integers at random between 1 and 100 inclusive. Demon- 
strate that two of these selected numbers will differ by exactly 9. 


You Try It: There are 9 people seated in a row of 12 chairs. Show that three 
consecutive chairs must be occupied. 


6.2 Orders and Permutations 


Suppose that we have n objects, as in Fig. 6.3. In how many different orders can 
they be presented? The technical language for this question is, “How many different 
permutations are there of n objects?” Here a permutation is simply a reordering of 
the objects. 


Figure 6.3 An array of n objects. 
A B C


Figure 6.4 An array of three objects. 


As a simple example, take n to be 3. Refer to Fig. 6.4. In how many different 
orders can we present these objects? Fig. 6.5 shows the different ways. 

We see that there are a total of six different ways to permute three objects. Contrast
this situation with two objects: there are only two different ways to permute two
objects (A, B or B, A). In fact the number of permutations of $n$ objects grows
rather rapidly with $n$.

Now let us return to n objects. In how many different ways can we order them? 
Look at the first position. We can put any of the n objects in that first position, 
so there are n possibilities for the first position. Next go to the second position. 


Figure 6.5 All the different orders for three objects.
One object is used up, so there are n — 1 objects remaining. Any one of those n — 1
objects can go into the second position. 

Now examine the third position. There are n — 2 objects remaining (since two of 
them are used up). So any of those n — 2 can go in the third position. And so forth. 
We see that there are three possibilities for the (n — 2)th position, two possibilities 
for the (n — 1)th position, and one possibility for the nth position. 

To count the total number of possible orderings, we multiply together all these 
counts: 


$$\text{number of permutations of } n \text{ objects} = n \cdot (n - 1) \cdot (n - 2) \cdots 3 \cdot 2 \cdot 1$$


This last expression is so important, and so pervasive, in mathematics that we give 
it a name. It is called n factorial, and is written n!. 


EXAMPLE 6.5 
In how many different ways can we order five objects? 


Solution: The number of permutations of five objects is 


$$5! = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 120$$











There are 120 different permutations of 5 objects. 
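
The counts in this section are easy to confirm by listing the orderings outright. The short sketch below uses Python's standard `itertools.permutations` and `math.factorial`; the variable names are illustrative.

```python
from itertools import permutations
from math import factorial

# All orders of the three objects A, B, C.
orders = ["".join(p) for p in permutations("ABC")]
print(orders)

# Listing the orders of five objects and counting them agrees with 5!.
count_five = len(list(permutations(range(5))))
print(count_five, factorial(5))
```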





6.3 Choosing and the Binomial Coefficients 


Suppose that you have $n$ objects and you are going to choose $k$ of them (where
$0 \le k \le n$). In how many different ways can you do this?

Just to illustrate the idea, take n = 3 and k = 2—see Fig. 6.6. For convenience 
we have labeled the objects A, B, C. The different ways that we can choose two 
from among the three are 


{A, B} {A, C} {B,C} 


There are no other possibilities. 

It is of interest to try to analyze this problem in general. So imagine now that 
we have n objects as shown in Fig. 6.7. We are going to select k of them, with 
$0 \le k \le n$. We may note in advance that there is only 1 way to select 0 objects
from among n—you just do not select any! So we may assume that k is at least 1. 
Figure 6.6 Three objects from which to choose two.


We select the first object. There are $n$ different ways to do this: either you select
the first one $a_1$, or the second one $a_2$, or ... or the last one $a_n$.

Now let us think about selecting the second object. One object has already been 
selected, so there are n — 1 objects left. And we may select any one of them with 
no restriction. So there are n — 1 ways to select the second object. See Fig. 6.8. 

Now the pattern emerges. When we go to select the third object, there are n — 2 
choices remaining. So there are n — 2 ways to select the third object. 

And on it goes. There are n — 3 ways to select the fourth object, n — 4 ways to 
select the fifth object, ..., n — k + 1 ways to select the kth object. 

Altogether then we have 


$$n \cdot (n - 1) \cdot (n - 2) \cdots (n - k + 1)$$

ways to select $k$ objects from among $n$ total. But we have overlooked an important
fact; refer to the last section. We could have selected those $k$ objects in any of $k!$
different orders. So in fact the number of ways to select $k$ objects from among $n$ is

$$\frac{n \cdot (n - 1) \cdot (n - 2) \cdots (n - k + 1)}{k!}$$

This is a very common expression in mathematics, and we give it the name $n$
choose $k$. We denote this quantity by $\binom{n}{k}$. In fact it is customary to write it as

$$\binom{n}{k} = \frac{n!}{(n - k)!\, k!}$$

EXAMPLE 6.6 


In how many different ways can we choose two objects from among five? 


Figure 6.7 Selecting k objects from among n.

Figure 6.8 One object selected and n − 1 remaining.
Solution: The answer is immediate: 


5 5! 120 
2) 6-92! 6-2 


In fact if the five objects are a, b, c, d, and e then the ten possible choices of two 
are: 


{a,b} {a,c} {a,d} {a,e} {b,c} 
{b,d} {b,e} {c,d} {c,e} {d,e} 
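A list like this one can be reproduced mechanically. Here is a minimal Python sketch using `itertools.combinations` from the standard library; the variable names are illustrative.

```python
from itertools import combinations

objects = ["a", "b", "c", "d", "e"]
pairs = list(combinations(objects, 2))   # unordered pairs, each listed once

for pair in pairs:
    print("{" + ",".join(pair) + "}")
print(len(pairs))
```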














EXAMPLE 6.7 
How many different 5-card poker hands are there in a standard 52-card deck of 
cards? 


Solution: The answer is 


\[ \binom{52}{5} = \frac{52!}{47!\,5!} = 2{,}598{,}960 \]














EXAMPLE 6.8 
In a five-card poker hand, what are the chances of having three-of-a-kind? 


Solution: For the three matching cards, the first card that we select can be any- 
thing. So there are 52 possibilities. But the next one must match it. So there are only 
three choices for the second card (remember that there are four of each kind of card 
in the deck). And for the third card there are then just two possibilities. The other 
two cards can be completely random, so there are 49 and 48 possibilities for those 
two cards. Thus the total number of ways to have a hand with three-of-a-kind is 


\[ \frac{52 \cdot 3 \cdot 2 \cdot 49 \cdot 48}{6} = 122{,}304 \]

Note the little surprise: we have divided by 6 since we must divide out all the 
different possible orders of the same set of three matching cards; the number of 
permutations of three objects is 3! = 6. 

Now the odds of getting three-of-a-kind are given by the ratio 


\[ \frac{122304}{2598960} \approx 0.0470588 \]


(using the count of all possible hands from the previous example) or slightly less 
than 1 in 20. 
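The product used in this example counts the two remaining cards in order and places no restriction on them, so hands that would be ranked as a full house or four-of-a-kind are included. As a cross-check, here is a Python sketch of a stricter count. It assumes the usual poker ranking convention, which is not the convention of the example: choose the rank of the triple, three of its four suits, and then two side cards of two different other ranks. The function name is hypothetical.

```python
from math import comb

def three_of_a_kind_hands() -> int:
    """Hands with exactly three cards of one rank and two unmatched side cards."""
    ranks_for_triple = 13
    suits_for_triple = comb(4, 3)      # which three of the four suits
    ranks_for_others = comb(12, 2)     # two different ranks for the side cards
    suits_for_others = 4 * 4           # one suit for each side card
    return ranks_for_triple * suits_for_triple * ranks_for_others * suits_for_others

hands = three_of_a_kind_hands()
total = comb(52, 5)
print(hands, total, hands / total)
```

Under this stricter convention the count is 54,912 hands, or roughly 1 hand in 47.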














You Try It: What are the odds of getting two pair in a standard five-card poker 
hand? 


You Try It: What are the odds of getting four-of-a-kind in a standard five-card 
poker hand? 


You Try It: You have a pot of beads. The beads are all identical in size and shape, 
but come in 2 different colors. You wish to make a beaded necklace consisting of 
10 beads. How many different necklaces could you make? Of course the order of 
the beads will be important: beads in the order black-white-black-white does not 
give the same necklace as black-black-white-white. Note also that two necklaces 
are equivalent, and count as just one necklace, if a rotation of one gives the other. 
[After you have solved this problem, try replacing “10” with n and “2” with k and 
solve it again. ] 


6.4 Other Counting Arguments 


It will be useful for this section and the material that follows to have standard 
mathematical summation notation at our disposal. The expression 


\[ \sum_{j=1}^{N} a_j \tag{6.1} \]

is used to denote 


\[ a_1 + a_2 + \cdots + a_N \]


Observe that the symbol $\Sigma$ is the Greek letter sigma, which is a cognate of our 
roman S. We read Eq. (6.1) as meaning that we sum $a_j$ as the index j ranges 
from 1 to N. We will also write 


\[ \sum_{j=5}^{N} b_j \]


to indicate that $b_j$ should be summed from 5 to N. In fact we allow any integral 
lower index M and any upper index N; the custom is that M should be less than or 
equal to N. 

Sometimes we also write 


\[ \sum_{j=1}^{\infty} a_j \]


to indicate that we are summing infinitely many terms. This idea will be explored 
in Chap. 13. 


EXAMPLE 6.9 

Draw a planar grid that is 31 squares wide and 17 squares high. How many different 
nontrivial rectangles can be drawn, using the lines of the grid to determine the 
boundaries? See Fig. 6.9. (Here “nontrivial” means that the rectangle has positive 
width and positive height.) 


Solution: We need a cogent method for counting the rectangles. Note that any 
rectangle is uniquely specified by the location of its lower left-hand corner, its 
width, and its height. 

Figure 6.9 A 31 x 17 grid. 

We think of the lower left point of the grid in Fig. 6.9 as the origin and locate 
points in the usual cartesian fashion (with the side of a square in the grid being 
a unit). 

How many rectangles can have their lower left corner at the origin? Well, there 
are 31 possible widths, from 1 to 31, and 17 possible heights, from 1 to 17. That is 
a total of 31 x 17 = 527 possible rectangles with lower left corner at the origin. 

This is a good start, but extrapolation from this beginning will constitute an 
awfully tedious manner of enumerating all the possible rectangles. We will now be 
a bit more analytical. Suppose that we consider rectangles with lower left corner 
at (j, k), with $0 \le j \le 30$ and $0 \le k \le 16$. There are (31 - j) possible widths for 
such rectangles and (17 - k) possible heights. Thus, altogether, there are 
(31 - j) x (17 - k) rectangles with lower left corner at (j, k). Given the range of j and 
k, the total number of all possible nontrivial rectangles is 


\[ S = \sum_{j=0}^{30} \sum_{k=0}^{16} (31 - j) \times (17 - k) \]


It is useful to expand the summands using the distributive law and then to regroup. 
We have 


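\[ S = \sum_{j=0}^{30} \sum_{k=0}^{16} \left[ 527 - 17j - 31k + jk \right] \]

This, in turn, equals 

\[ \sum_{j=0}^{30} \sum_{k=0}^{16} 527 - 17 \sum_{j=0}^{30} \sum_{k=0}^{16} j - 31 \sum_{j=0}^{30} \sum_{k=0}^{16} k + \sum_{j=0}^{30} \sum_{k=0}^{16} jk \]

\[ = 31 \cdot 17 \cdot 527 - 17 \cdot 17 \cdot \sum_{j=0}^{30} j - 31 \cdot 31 \cdot \sum_{k=0}^{16} k + \left[ \sum_{j=0}^{30} j \right] \left[ \sum_{k=0}^{16} k \right] \]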


Now we may use Gauss’s formula from Sec. 2.4 to evaluate each of the sums in the 
last formula. We obtain 


\[
\begin{aligned}
S &= 277729 - 289 \cdot 465 - 961 \cdot 136 + 465 \cdot 136 \\
  &= 277729 - 134385 - 130696 + 63240 \\
  &= 75888
\end{aligned}
\]
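The double sum is easy to check by machine. The Python sketch below evaluates the sum over all lower left corners exactly as set up in the solution. It also compares the result with a second way of counting that is not used in the text and is offered only as a cross-check: a rectangle is determined by choosing two of the 32 vertical grid lines and two of the 18 horizontal grid lines. Both function names are illustrative.

```python
from math import comb

def count_rectangles(width: int, height: int) -> int:
    """Sum (width - j) * (height - k) over every lower left corner (j, k)."""
    return sum(
        (width - j) * (height - k)
        for j in range(width)       # j = 0, ..., width - 1
        for k in range(height)      # k = 0, ..., height - 1
    )

def count_rectangles_by_lines(width: int, height: int) -> int:
    """Cross-check: pick 2 vertical and 2 horizontal grid lines as the sides."""
    return comb(width + 1, 2) * comb(height + 1, 2)

print(count_rectangles(31, 17))
print(count_rectangles_by_lines(31, 17))
```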

In Chap. 1 we saw how to use the method of induction to calculate certain finite 
sums, such as 1 + 2 + 3 + ... + k, in closed form. Now we look at another type of 
sum, known as a geometric sum. 

A geometric sum is a sum of powers of a fixed number. For example, 


\[ 1 + 3 + 3^2 + 3^3 + \cdots + 3^{10} \]


is a finite geometric sum. Also 


\[ 1 + \frac{1}{2} + \left( \frac{1}{2} \right)^2 + \left( \frac{1}{2} \right)^3 + \cdots \]


is an infinite geometric sum. Notice in each case that the first term of the sum is 1, 
and that is the zeroth power of the fixed number $\lambda$ (in the first instance 3 and in 
the second instance 1/2). 


EXAMPLE 6.10 
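Let $\lambda$ be a real number with $\lambda \ne 1$ and k a positive integer. Calculate the sum of the 
geometric series 


\[ S = 1 + \lambda + \lambda^2 + \cdots + \lambda^k \]


Solution: The key is to note that multiplying S by $\lambda$ does not change it very much. 
Indeed 


\[ \lambda S = \lambda + \lambda^2 + \lambda^3 + \cdots + \lambda^{k+1} \]


The sums S and $\lambda S$ differ only in the presence of 1 in the first of these and the 
presence of $\lambda^{k+1}$ in the second. In other words 


\[ S - 1 = \lambda S - \lambda^{k+1} \]
That is, 
\[ S(\lambda - 1) = \lambda^{k+1} - 1 \]
We finally write this as 


\[ S = \frac{\lambda^{k+1} - 1}{\lambda - 1} \]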
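The closed form can be compared with a term-by-term sum. The sketch below uses Python's `fractions.Fraction` so that the comparison is exact rather than subject to rounding; the function names and the parameter name `lam` (standing for the fixed number being raised to powers) are illustrative.

```python
from fractions import Fraction

def geometric_sum_direct(lam: Fraction, k: int) -> Fraction:
    """Add 1 + lam + lam**2 + ... + lam**k one term at a time."""
    return sum((lam ** j for j in range(k + 1)), Fraction(0))

def geometric_sum_closed(lam: Fraction, k: int) -> Fraction:
    """Closed form (lam**(k+1) - 1) / (lam - 1), valid only when lam != 1."""
    if lam == 1:
        raise ValueError("the closed form needs lam != 1")
    return (lam ** (k + 1) - 1) / (lam - 1)

third = Fraction(1, 3)
print(geometric_sum_direct(Fraction(3), 4), geometric_sum_closed(Fraction(3), 4))
print(geometric_sum_direct(third, 100) == geometric_sum_closed(third, 100))
```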














An example of what the solution of the last problem tells us is as follows: Suppose 
that we want to know explicitly the value of the sum $S = 1 + (1/3) + (1/3)^2 
+ \cdots + (1/3)^{100}$. It would be quite tedious to add all these numbers up by hand 
(or even using a calculator). But this question fits the paradigm of the geometric 
series, with $\lambda = 1/3$ and k = 100. Thus 


\[ S = \frac{(1/3)^{101} - 1}{(1/3) - 1} = \frac{3}{2} - \frac{1}{2} \left( \frac{1}{3} \right)^{100} \]

Now a calculator may be used to see that the value of this last expression is about 
$1.5 - 9.702 \cdot 10^{-49}$. 


Sometimes it is convenient, when $-1 < \lambda < 1$, to reason as follows: for 
$k \in \{1, 2, 3, \ldots\}$ we set 


\[ S_k = 1 + \lambda + \lambda^2 + \cdots + \lambda^k \]

We know, from the last example, that 


\[ S_k = \frac{1 - \lambda^{k+1}}{1 - \lambda} \tag{6.2} \]


Now we could ask what happens if, instead of adding just finitely many powers of 
$\lambda$, we add all powers of $\lambda$. This would correspond (in a sense that you will learn 
about more precisely when we get to Chaps. 12 and 13) to letting k tend to infinity 
in Eq. (6.2). 

The result is that the sum $S = 1 + \lambda + \lambda^2 + \cdots$ of all nonnegative powers of 
$\lambda$ is obtained by asking what happens to the right-hand side of Eq. (6.2) when 
k becomes large without bound. Since $|\lambda| < 1$, it is plausible that $\lambda^{k+1}$ becomes 
smaller and smaller, in fact tends to 0, as k increases without bound. In other words, 
$S_k \to 1/(1 - \lambda)$. We write 


\[ \sum_{j=0}^{\infty} \lambda^j = \frac{1}{1 - \lambda} \tag{6.3} \]


This is a variant of the standard mathematical notation for summation. The symbol 
$\Sigma$ denotes the summation process. The lower limit means that we begin our 
summing with the exponent j equaling 0, and the upper limit $\infty$ means that there is 
no upper bound (in other words, we sum all nonnegative powers of $\lambda$). 

Here is an illustrative example: what is the sum 


\[ 1 + \frac{1}{2} + \left( \frac{1}{2} \right)^2 + \left( \frac{1}{2} \right)^3 + \cdots \]


equal to? Draw a picture of the interval [0, 2] on your scratch pad. The sum of the 
first two terms is 3/2. Add one additional term and you cover half the remaining 
distance to 2. Add the fourth term and you again cover half the remaining distance 
to 2. In fact each additional term repeats this key property. It is plausible to suppose 
that the entire sum equals 2. 

In fact our new formula makes this supposition concrete: 


\[ \sum_{j=0}^{\infty} \left( \frac{1}{2} \right)^j = \frac{1}{1 - (1/2)} = 2 \]
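The "cover half the remaining distance" picture can be watched numerically. This Python sketch computes partial sums of the powers of 1/2 exactly and prints how far each one falls short of 2; the function name `partial_sum` is illustrative.

```python
from fractions import Fraction

def partial_sum(lam: Fraction, k: int) -> Fraction:
    """S_k = 1 + lam + ... + lam**k, computed term by term."""
    total, term = Fraction(0), Fraction(1)
    for _ in range(k + 1):
        total += term
        term *= lam
    return total

half = Fraction(1, 2)
limit = 1 / (1 - half)            # right-hand side of the infinite-sum formula
for k in (1, 2, 3, 10, 20):
    gap = limit - partial_sum(half, k)
    print(k, float(gap))          # the gap to 2 halves with each added term
```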





6.5 Generating Functions 


Now we shall learn the powerful technique of generating functions. In fact we shall 
build on what has gone before. We will use our new ideas about geometric series, 
in a very simple form, in the next problem. 


EXAMPLE 6.11 

The Fibonacci sequence is famous in mathematics, indeed in all of science. The 
Fibonacci sequence describes the spacing of leaves on a vine, the turns of a conch 
shell, and many other natural phenomena. 

The sequence is formed in the following way: the first two terms are each equal 
to 1. The next term is obtained by adding the preceding two: thus the third term 
equals 2. The next term (the fourth) is obtained by adding the preceding two: 
1 +2 = 3. The next term is obtained by adding the preceding two: 2 + 3 = 5. In 
fact the first 10 terms of the Fibonacci sequence are 


1, 1, 2, 3, 5, 8, 13, 21, 34, 55 

We denote the jth term of the Fibonacci sequence by $a_j$. Thus 

\[ a_0 = 1, \quad a_1 = 1, \quad a_2 = 2, \quad a_3 = 3, \quad a_4 = 5 \]


and so forth. 
Show that the following formula for the Fibonacci sequence is valid: 


\[ a_j = \frac{\left( \frac{1 + \sqrt{5}}{2} \right)^{j+1} - \left( \frac{1 - \sqrt{5}}{2} \right)^{j+1}}{\sqrt{5}} \]


Solution: We shall use the method of generating functions, a powerful technique 
that is used throughout the mathematical sciences. 

We write $F(x) = a_0 + a_1 x + a_2 x^2 + \cdots$. Here the $a_j$'s are the terms of the 
Fibonacci sequence and the letter x denotes an unspecified variable. What is curious 
here is that we do not care about what x is. It is simply an unspecified variable. We 
shall not solve for x. We intend to manipulate the function F in such a fashion that 
we will be able to solve for the coefficients $a_j$. Just think of F(x) as a polynomial 
with a lot of coefficients. 

Notice that 


\[ xF(x) = a_0 x + a_1 x^2 + a_2 x^3 + a_3 x^4 + \cdots \]
and 
\[ x^2 F(x) = a_0 x^2 + a_1 x^3 + a_2 x^4 + a_3 x^5 + \cdots \]
Thus, grouping like powers of x, we see that 


\[ F(x) - xF(x) - x^2 F(x) = a_0 + (a_1 - a_0)x + (a_2 - a_1 - a_0)x^2 + (a_3 - a_2 - a_1)x^3 + (a_4 - a_3 - a_2)x^4 + \cdots \]


But the basic property that defines the Fibonacci sequence is that $a_2 - a_1 - a_0 = 0$, 
$a_3 - a_2 - a_1 = 0$, and so on. Thus our equation simplifies drastically to 


\[ F(x) - xF(x) - x^2 F(x) = a_0 + (a_1 - a_0)x \]
We also know that $a_0 = a_1 = 1$. Thus the equation becomes 
\[ (1 - x - x^2) F(x) = 1 \]
or 


\[ F(x) = \frac{1}{1 - x - x^2} \tag{6.4} \]


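It is convenient to factor the denominator as follows: 


\[ F(x) = \frac{1}{\left[ 1 + \frac{2}{1 - \sqrt{5}}\, x \right] \cdot \left[ 1 + \frac{2}{1 + \sqrt{5}}\, x \right]} \]


[just simplify the right-hand side to see that it equals Eq. (6.4)]. 

A little more algebraic manipulation yields that 


\[ F(x) = \frac{5 + \sqrt{5}}{10} \left[ \frac{1}{1 + \frac{2}{1 - \sqrt{5}}\, x} \right] + \frac{5 - \sqrt{5}}{10} \left[ \frac{1}{1 + \frac{2}{1 + \sqrt{5}}\, x} \right] \]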





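Now we want to apply the formula in Eq. (6.3) from Sec. 6.4 to each of the 
fractions in brackets ([ ]). For the first fraction, we think of $-\frac{2}{1 - \sqrt{5}}\, x$ as $\lambda$. 
Thus the first expression in brackets equals 


\[ \sum_{j=0}^{\infty} \left( -\frac{2}{1 - \sqrt{5}}\, x \right)^j \]

Likewise the second expression in brackets equals 

\[ \sum_{j=0}^{\infty} \left( -\frac{2}{1 + \sqrt{5}}\, x \right)^j \]

All told, we find that 


\[ F(x) = \frac{5 + \sqrt{5}}{10} \sum_{j=0}^{\infty} \left( -\frac{2}{1 - \sqrt{5}}\, x \right)^j + \frac{5 - \sqrt{5}}{10} \sum_{j=0}^{\infty} \left( -\frac{2}{1 + \sqrt{5}}\, x \right)^j \]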











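Grouping terms with like powers of x, we finally conclude that 


\[ F(x) = \sum_{j=0}^{\infty} \left[ \frac{5 + \sqrt{5}}{10} \left( -\frac{2}{1 - \sqrt{5}} \right)^j + \frac{5 - \sqrt{5}}{10} \left( -\frac{2}{1 + \sqrt{5}} \right)^j \right] x^j \]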


But we began our solution of this problem with the formula 








\[ F(x) = a_0 + a_1 x + a_2 x^2 + \cdots \]


The two different formulas for F (x) must agree. In particular, the coefficients of 
the different powers of x must match up. We conclude that 


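\[ a_j = \frac{5 + \sqrt{5}}{10} \left( -\frac{2}{1 - \sqrt{5}} \right)^j + \frac{5 - \sqrt{5}}{10} \left( -\frac{2}{1 + \sqrt{5}} \right)^j \]

We rewrite 


\[ \frac{5 + \sqrt{5}}{10} = \frac{1}{\sqrt{5}} \cdot \frac{1 + \sqrt{5}}{2}, \qquad \frac{5 - \sqrt{5}}{10} = -\frac{1}{\sqrt{5}} \cdot \frac{1 - \sqrt{5}}{2} \]

and 

\[ -\frac{2}{1 - \sqrt{5}} = \frac{1 + \sqrt{5}}{2}, \qquad -\frac{2}{1 + \sqrt{5}} = \frac{1 - \sqrt{5}}{2} \]

Making these four substitutions into our formula for $a_j$, and doing a few algebraic 
simplifications, yields 

\[ a_j = \frac{\left( \frac{1 + \sqrt{5}}{2} \right)^{j+1} - \left( \frac{1 - \sqrt{5}}{2} \right)^{j+1}}{\sqrt{5}} \]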





as desired. 
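A closed formula of this kind is easy to test against the defining rule. The Python sketch below builds the sequence from the recursion with $a_0 = a_1 = 1$ and compares it with the closed form, using the exponent j + 1 that matches this indexing. Floating-point arithmetic is an assumption of the sketch, so the result is rounded to the nearest integer; the function names are illustrative.

```python
from math import sqrt

def fibonacci_terms(count: int) -> list:
    """First `count` terms with a_0 = a_1 = 1 and a_j = a_{j-1} + a_{j-2}."""
    terms = [1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]

def closed_form(j: int) -> int:
    """Evaluate the closed formula; rounding absorbs floating-point error."""
    root5 = sqrt(5)
    phi = (1 + root5) / 2
    psi = (1 - root5) / 2
    return round((phi ** (j + 1) - psi ** (j + 1)) / root5)

print(fibonacci_terms(10))
print([closed_form(j) for j in range(10)])
```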











Notice how, in this last example, we combined F, xF, and $x^2 F$ so that important 
cancellations would take place. That is how we used the special properties of the 
Fibonacci sequence. In other problems, such as those in the next section, you will 
need to use different combinations, with possibly different coefficients, that are 
tailored to each specific problem. 


6.6 A Few Words About Recursion Relations 


Let $\{a_j\}$ be a sequence, or a list of numbers. More explicitly, the sequence is 
\[ a_0, a_1, a_2, \ldots \]


where the list never stops. It is frequently the case that we will have a rule that 
tells us the value of the jth element of the list in terms of some of the previous 
elements. This situation is called a recursion. The method of generating functions 
can sometimes be used to good effect to solve recursions. We shall illustrate the 
idea here with some examples. 


EXAMPLE 6.12 
A sequence is defined by the rule $a_0 = 4$, $a_1 = -1$, and $a_j = -a_{j-1} + 2a_{j-2}$ 
for $j \ge 2$. Use the method of generating functions to find a formula for $a_j$. 

Remark 6.1 Notice that the recursion rule says in particular that 


\[ a_2 = -a_1 + 2a_0 \]
As a result, $a_2 = 9$. Similarly, 
\[ a_3 = -a_2 + 2a_1 \]
It follows that $a_3 = -11$. 
This is how recursions work. 
Solution: We write $F(x) = a_0 + a_1 x + a_2 x^2 + \cdots$. Here the $a_j$'s are the terms 
of the unknown sequence and the letter x denotes an unspecified variable. 
Notice that 


\[ xF(x) = a_0 x + a_1 x^2 + a_2 x^3 + a_3 x^4 + \cdots \]
and 
\[ x^2 F(x) = a_0 x^2 + a_1 x^3 + a_2 x^4 + a_3 x^5 + \cdots \]
Thus, grouping like powers of x, we see that 
\[ F(x) + xF(x) - 2x^2 F(x) = a_0 + (a_1 + a_0)x + (a_2 + a_1 - 2a_0)x^2 + (a_3 + a_2 - 2a_1)x^3 + \cdots \]


But the basic property that defines our sequence is that $a_2 = -a_1 + 2a_0$, 
$a_3 = -a_2 + 2a_1$, and so on. Thus our equation simplifies drastically to 


\[ F(x) + xF(x) - 2x^2 F(x) = a_0 + (a_1 + a_0)x \]
We also know that $a_0 = 4$ and $a_1 = -1$. Thus the equation becomes 
\[ (1 + x - 2x^2) F(x) = 4 + 3x \]


or 


\[ F(x) = \frac{4 + 3x}{1 + x - 2x^2} \tag{6.5} \]
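It is convenient to factor the denominator as follows: 


\[ F(x) = \frac{4 + 3x}{(-2x - 1) \cdot (x - 1)} \]


[just simplify the right-hand side to see that it equals Eq. (6.5)]. 
A little more algebraic manipulation yields that 


\[
\begin{aligned}
F(x) &= \left( -\frac{5}{3} \right) \cdot \frac{1}{-2x - 1} + \left( -\frac{7}{3} \right) \cdot \frac{1}{x - 1} \\
     &= \left( \frac{5}{3} \right) \cdot \frac{1}{1 - (-2x)} + \left( \frac{7}{3} \right) \cdot \frac{1}{1 - x}
\end{aligned}
\]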


Now we want to apply the formula in Eq. (6.3) from Sec. 6.4 to each of the fractions 
here. For the first fraction, we think of $-2x$ as $\lambda$. Thus the first fractional expression 
equals 


\[ \sum_{j=0}^{\infty} (-2x)^j \]


Likewise the second fractional expression equals 
\[ \sum_{j=0}^{\infty} x^j \]


All told, we find that 


\[ F(x) = \frac{5}{3} \sum_{j=0}^{\infty} (-2x)^j + \frac{7}{3} \sum_{j=0}^{\infty} x^j \]


Grouping terms with like powers of x, we finally conclude that 
\[ F(x) = \sum_{j=0}^{\infty} \left[ \frac{5}{3} (-2)^j + \frac{7}{3} \right] x^j \]
But we began our solution of this problem with the formula 


\[ F(x) = \sum_{j=0}^{\infty} a_j x^j \]


The two different formulas for F(x) must agree. In particular, the coefficients of 
the different powers of x must match up. We conclude that 


\[ a_j = \frac{5}{3} (-2)^j + \frac{7}{3} \]





That is the solution to our recursion problem. 
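As with any closed formula for a recursion, it is worth checking the answer against the rule itself. The following Python sketch generates terms from the starting values 4 and -1 and the rule "minus the previous term plus twice the one before," then compares them with (5/3)(-2)^j + 7/3 using exact fractions. The function names are illustrative.

```python
from fractions import Fraction

def by_recursion(count: int) -> list:
    """First `count` terms from a_0 = 4, a_1 = -1, a_j = -a_{j-1} + 2 a_{j-2}."""
    terms = [4, -1]
    while len(terms) < count:
        terms.append(-terms[-1] + 2 * terms[-2])
    return terms[:count]

def by_formula(j: int) -> Fraction:
    """Closed form (5/3)(-2)**j + 7/3, kept exact with Fraction."""
    return Fraction(5, 3) * (-2) ** j + Fraction(7, 3)

print(by_recursion(6))
print([int(by_formula(j)) for j in range(6)])
```

The same two-function pattern can be reused to check an answer to either of the exercises that follow.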











You Try It: A sequence is defined by the rule $a_0 = 2$, $a_1 = 1$, and $a_j = 3a_{j-1} - a_{j-2}$. 
Use the method of generating functions to find a formula for $a_j$. 


You Try It: A sequence is defined by the rule $a_0 = 0$, $a_1 = -1$, and $a_j = 3a_{j-1} - 2a_{j-2}$. 
Use the method of generating functions to find a formula for $a_j$. 


6.7 Probability 


In Sec. 6.3 we have alluded to certain questions of probability theory. Now we take 
a moment to treat the subject a bit more formally and precisely. 

If an event E has finitely many possible outcomes $o_1, o_2, \ldots, o_k$, then we assign 
a positive number $p_j$ to each outcome $o_j$ to indicate the likelihood that outcome 
will actually occur. We mandate in advance that 


\[ \sum_{j=1}^{k} p_j = 1 \]


indicating that 1 describes the totality of all outcomes. 

If we are flipping a coin (that is the event E), then there are two possible outcomes: 
heads and tails. Observation of many flips (or just common sense) teaches us that 
the two outcomes are equally likely. Therefore we assign the number $p_h = 1/2$ to 
the outcome “heads” and the number $p_t = 1/2$ to the outcome “tails.” Notice that 
1/2 + 1/2 = 1, as we have mandated. We call $p_h$ the probability that heads will 
occur and $p_t$ the probability that tails will occur. 

It is common to express probabilities in terms of percentage. We might also say 
(referring to the last paragraph) that there is a 50% probability that heads will occur 
and a 50% probability that tails will occur. 

Now let us look at a different situation. I hold a red ball and a blue ball. So do 
you. Each of us will randomly put one of the balls into a box on the table. What are 
the probabilities of the different outcomes? The different possibilities are BB, RR, 
and RB (here R stands for “red” and B stands for “blue”). A naive analysis might 
cause one to guess that each of these three outcomes has the same probability.¹ But 
the naive analysis is wrong. Here is why: 


• The outcome BB can only occur if each of us contributes a blue ball. So it can 
only happen one way. 


• The outcome RR can only occur if each of us contributes a red ball. So it can 
only happen one way. 


• The outcome RB can occur if I contribute a red ball and you contribute a blue, 
or if I contribute a blue ball and you contribute a red. So it can happen in two 
different ways. 


Thus we see that the correct assignment of probabilities is $p_{BB} = 1/4$, $p_{RR} = 1/4$, 
$p_{RB} = 1/2$. We have chosen these numbers to meet the following criteria: 


• The sum of all the probabilities should be 1. 
• The probability of BB and RR should be equal. 
• The probability of RB should be double that of BB or RR. 
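The reasoning can be made concrete by listing the four equally likely (my ball, your ball) pairs and then forgetting who contributed which ball. This Python sketch does that; the function name `box_probabilities` and the string labels are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def box_probabilities() -> dict:
    """Probability of each unordered outcome when two people each drop in R or B."""
    choices = list(product("RB", repeat=2))   # (mine, yours): 4 equally likely pairs
    # Sorting each pair in reverse order gives both mixed pairs the label "RB".
    counts = Counter("".join(sorted(pair, reverse=True)) for pair in choices)
    return {outcome: Fraction(n, len(choices)) for outcome, n in counts.items()}

print(box_probabilities())
```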


EXAMPLE 6.13 
A girl flips a fair coin five times. What is the probability that precisely three of the 
flips will come up heads? 


Solution: Each flip has two possible outcomes. So the total number of possible 
outcomes for five flips is 


2-2-2-2-2=32 


Now the number of ways that three head flips can occur is the number of ways 
that three objects can be chosen from five. This is 


\[ \binom{5}{3} = \frac{5!}{3!\,2!} = 10 \]


We conclude that the answer to the question is 


\[ \frac{10}{32} = 0.3125 \]
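With only 32 possible sequences of five flips, the answer can also be found by listing them all. The Python sketch below counts the sequences that contain exactly three heads and compares the result with the formula; the function name `exactly_heads` is illustrative.

```python
from itertools import product
from math import comb

def exactly_heads(flips: int, heads: int) -> float:
    """Probability of exactly `heads` heads, by listing all 2**flips sequences."""
    sequences = list(product("HT", repeat=flips))
    favorable = sum(1 for seq in sequences if seq.count("H") == heads)
    return favorable / len(sequences)

print(exactly_heads(5, 3))
print(comb(5, 3) / 2 ** 5)
```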

















¹In fact this is what geneticists in the early twentieth century thought, where for them B is a dominant gene 
and R is a recessive gene. It was eminent mathematician G. H. Hardy who noted the error of their ways. The 
resulting published paper led to what is now called the Hardy-Weinberg law in genetics. 

You Try It: Calculate the probability that the girl in the last example will get zero 
heads, one head, two heads, four heads, and five heads. Add up all these results 
(including the result for three heads from the last example). The answer should of 
course be 1, since these are all the possible outcomes. 


EXAMPLE 6.14 

Eight slips of paper with the letters A, B, C, D, E, F, G, and H written on them 
are placed into a bin. The eight slips are drawn one by one from the bin. What is 
the probability that the first four to come out are A, C, E, and H (in some order)? 


Solution: This problem is much less exciting than it sounds. After we choose the 
first four slips, it does not matter what we do. We could burn the others, or go drink 
coffee, or enroll in truck driving school. And the statement of the problem rules out 
the order in which the slips are drawn. Stripping away the language, we see that we 
are randomly selecting four objects from among eight. We want to know whether 
a particular four, in any order, will be the ones that we select. 

The number of different ways to choose four objects from among eight is 


\[ \binom{8}{4} = \frac{8!}{4!\,4!} = \frac{8 \cdot 7 \cdot 6 \cdot 5}{4 \cdot 3 \cdot 2 \cdot 1} = 70 \]

Of these different subsets of four, only one will be the set {A, C, E, H}. Thus the 
probability of the first four slips being the ones that we want will be 1/70. 
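For readers who would like to see the "stripping away" step confirmed, here is a brute-force Python sketch that keeps the full drawing order. It lists all 8! ways to draw the slips and counts those whose first four slips are A, C, E, and H in some order. The function name and its default argument are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def first_four_probability(target=frozenset("ACEH")) -> Fraction:
    """Fraction of all drawing orders whose first four slips form `target`."""
    draws = list(permutations("ABCDEFGH"))        # all 8! = 40320 drawing orders
    hits = sum(1 for order in draws if frozenset(order[:4]) == target)
    return Fraction(hits, len(draws))

print(first_four_probability())
```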

















EXAMPLE 6.15 

Suppose that you write 37 letters and then you address 37 envelopes to go with 
them. Closing your eyes, you randomly stuff one letter into each envelope. What is 
the probability that just one envelope contains the wrong letter? 


Solution: Say that the envelopes are numbered 1-37 and the letters are numbered 
1-37. If letters 1 through 36 go into envelopes 1-36 then what remains are letter 
37 and envelope 37. So that last letter is forced to go into the correct envelope. 

Of course there is nothing special about the numbering used in the last paragraph. 
It just helped us to make a simple point: it is impossible to have just one letter in 
the wrong envelope. If one letter is in the wrong envelope then at least two letters 
are in the wrong envelope. 

Thus the answer to our problem is that the probability is zero. 














You Try It: You have an urn with 100 black marbles and 100 white marbles. You 
close your eyes and grab five marbles at random. What is the probability that at 
least three of them are black? 
EXAMPLE 6.16 

Suppose that you have 37 envelopes and you address 37 letters to go with them. 
Closing your eyes, you randomly stuff one letter into each envelope. What is the 
probability that precisely two letters are in the wrong envelopes and all others in 
the correct envelope? 


Solution: If just two letters are to be in the wrong envelope then they will have 
to be switched; for instance, letter 5 could go into envelope 19 and letter 19 into 
envelope 5. Thus the number of different ways that we can get just two letters in 
the wrong envelopes is just the same as the number of different ways that we can 
choose two letters from among 37. (All of the other 35 letters must go into their 
correct envelopes, so there is no choice involved for those 35.) This number is 


\[ N = \binom{37}{2} = \frac{37!}{35!\,2!} = \frac{37 \cdot 36}{2} = 666 \]








Now if we imagine the envelopes, in their correct order (numbers 1-37), lying 
in a row on the table, then a random distribution of letters among the envelopes 
just corresponds to a random ordering of the letters. Thus the number of different 
possible ways to distribute 37 letters among 37 envelopes is 37! (a very large 
number). In conclusion, the probability that all letters but two will be in the correct 
envelopes is 


\[ P = \frac{666}{37!} \approx 4.84 \cdot 10^{-41} \]
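Listing all 37! stuffings is out of the question, but the two counting claims (no stuffing has exactly one wrong letter; the number with exactly two wrong letters is "n choose 2") can be checked by brute force for a small number of letters. The Python sketch below does this for six letters, a size chosen only for illustration, and then evaluates the probability for 37 letters exactly. The function name is illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial

def count_by_misplaced(n: int) -> dict:
    """Map m -> number of ways to stuff n letters with exactly m in wrong envelopes."""
    counts = {}
    for stuffing in permutations(range(n)):
        wrong = sum(1 for envelope, letter in enumerate(stuffing) if envelope != letter)
        counts[wrong] = counts.get(wrong, 0) + 1
    return counts

small = count_by_misplaced(6)
print(small.get(1, 0), small[2], comb(6, 2))

probability = Fraction(comb(37, 2), factorial(37))
print(float(probability))
```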














6.8 Pascal's Triangle 


The idea of what we now call Pascal's triangle actually goes back to Yanghui 
in about the thirteenth century in China (and the Chinese call the object Yanghui’s 
triangle). But it was Blaise Pascal (1623-1662) who really developed the concept 
and showed its importance and context in modern mathematics. 

Pascal’s triangle is a triangle formed according to the following precept (see 
Fig. 6.10). 

The rule for forming Pascal’s triangle is this: 


• A 1 goes at the top vertex. 


• Each term in each subsequent row is formed by adding together the two 
numbers that are to the upper left and upper right of the given term. 

            1 
          1   1 
        1   2   1 
      1   3   3   1 
    1   4   6   4   1 


Figure 6.10 Pascal’s triangle. 


It is convenient in our discussion to refer to the very top row of Pascal’s triangle 
(with a single digit 1 in it) as the zeroth row. The next row is the first row. And so 
forth. So the zeroth row has one element, the first row has two elements, the second 
row has three elements, and so forth. 

Thus, in the first row, the leftmost term has nothing to its upper left and a 1 to 
its upper right. The sum of these is 1. So the leftmost term in the first row is 1. 
Likewise the rightmost term in the first row is 1. 

For the second row, the leftmost term has nothing to its upper left and a 1 to its 
upper right. So this new term is 1. Likewise the rightmost term in the second row 
is 1. But the middle term in the second row has a 1 to its upper left and a 1 to its 
upper right. Therefore this middle term equals 1 + 1 = 2. That is what we see in 
Pascal’s triangle. 

For the third row, we see as usual that the leftmost term and the rightmost term 
are both 1 (in fact this property holds in all rows). But the second term in the third 
row has a 1 to its upper left and a 2 to its upper right. Therefore this second term 
is equal to 3. Likewise the third term is equal to 3. 

The rest of Pascal’s triangle is calculated similarly. The triangle is obviously 
symmetric from left to right, about a vertical axis through the upper vertex. The kth 
row has k + 1 terms. What is the significance of these numbers? 

One obvious significance is the relation of the triangle to the celebrated binomial 
theorem. Consider the quantity 

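\[
\begin{aligned}
(a + b)^k = a^k &+ \frac{k}{1} a^{k-1} b + \frac{k(k-1)}{2 \cdot 1} a^{k-2} b^2 + \frac{k(k-1)(k-2)}{3 \cdot 2 \cdot 1} a^{k-3} b^3 + \cdots \\
&+ \frac{k(k-1)(k-2)}{3 \cdot 2 \cdot 1} a^3 b^{k-3} + \frac{k(k-1)}{2 \cdot 1} a^2 b^{k-2} + \frac{k}{1} a b^{k-1} + b^k
\end{aligned}
\]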


Now we will examine this important formula in the first several specific instances: 


k = 0: $(a + b)^0 = 1$ 
k = 1: $(a + b)^1 = a + b$ 
k = 2: $(a + b)^2 = a^2 + 2ab + b^2$ 
k = 3: $(a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3$ 
k = 4: $(a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4$ 
k = 5: $(a + b)^5 = a^5 + 5a^4 b + 10a^3 b^2 + 10a^2 b^3 + 5ab^4 + b^5$ 


We see that the coefficients that occur for k = 0 are just the same as the zeroth 
row of Pascal’s triangle: namely, a single digit 1. The coefficients that occur for 
k = 1 are just the same as the first row of Pascal’s triangle: namely, 1 and 1. The 
coefficients that occur for k = 2 are just the same as the second row of Pascal’s 
triangle: namely 1, 2, 1. And so forth. Of course if we think about how the binomial 
expression $(a + b)^k$ is multiplied out, then we see that the coefficients are formed 
by the very same rule that forms Pascal’s triangle. And that explains why the rows 
of Pascal’s triangle give the binomial coefficients. 

Another remarkable fact is that the sum of the numbers in the kth row of 
Pascal’s triangle is $2^k$. For example, in the third row, $1 + 3 + 3 + 1 = 8 = 2^3$. This 
is again a fundamental property of the binomial coefficients that can be verified 
with mathematical induction. 
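Both observations (the rows are the binomial coefficients, and the kth row sums to the kth power of 2) can be checked by building the triangle from its defining rule. In the Python sketch below, a missing upper neighbor is treated as 0, which is one convenient way to encode "nothing to its upper left." The function name `pascal_rows` is illustrative.

```python
from math import comb

def pascal_rows(count: int) -> list:
    """Rows 0 .. count-1, each built from the row above by the addition rule."""
    rows = [[1]]
    while len(rows) < count:
        above = [0] + rows[-1] + [0]        # a missing neighbor counts as 0
        rows.append([above[i] + above[i + 1] for i in range(len(above) - 1)])
    return rows[:count]

for k, row in enumerate(pascal_rows(6)):
    print(k, row, sum(row))
```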

A pleasing interpretation of the rows of Pascal’s triangle can be given in terms 
of coin tosses: 


• If we toss a coin once, then there are two possible outcomes: one head or 
one tail. This information is tabulated in the first row of Pascal’s triangle. 


• If we toss a coin twice, then there are three possible outcomes: two heads 
(which can occur just one way), a head and a tail (which can occur two 
ways: heads-tails or tails-heads), and two tails (which can occur just one 
way). This information is tabulated by 1-2-1 in the second row of 
Pascal’s triangle. 


• If we toss a coin three times, then there are four possible outcomes: three heads 
(which can occur just one way), two heads and a tail (which can occur three 
ways: heads-heads-tails, heads-tails-heads, or tails-heads-heads), two tails 
and a head (which can occur three ways: tails-tails-heads, tails-heads-tails, 
or heads-tails-tails), and three tails (which can occur just one way). This 
information is tabulated by 1-3-3-1 in the third row of Pascal’s triangle. 


• And so forth. 


Pascal’s triangle is also a useful mnemonic for carrying information about the 
choose function. For this purpose we number the rows 0, 1, 2, and so on as usual. 
We also number the terms in each row 0, 1, 2, and so on. Now suppose we wish to 
know how many different ways we can choose three objects from among five (in 
fact this came up in an example in the last section—the answer was 10). Simply 
go to row five of the triangle, term three, and we see the answer to be 10. Or if we 
want to know how many different ways to choose five items from among five (the 
answer is obviously 1), we go to row five and look at the fifth term (remembering 
to count from 0). 

Of course Pascal’s triangle is not magic; it is mathematics. And the mathematical 
explanation behind everything we have said here is the fundamental formula 


\[ \binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k} \]

You may test this formula by hand, or on your calculator. It is easy to confirm 
rigorously using mathematical induction. And it simply says (if we think of the kth 
element in the nth row of Pascal’s triangle as $a_{n,k}$) that 

\[ a_{n,k} = a_{n-1,k-1} + a_{n-1,k} \]

This just says that the element $a_{n,k}$ is formed by adding the two elements in the row 
above it. 


6.9 Ramsey Theory 


Frank Plumpton Ramsey (1903—1930) was a Professor at Cambridge University 
in England. In his tragically short life he established himself as an important and 
influential mathematician. He studied Whitehead and Russell’s Principia Mathe- 
matica and offered a number of improvements, including ways to address Russell’s 
paradox. He wrote just one paper, entitled “On a Problem of Formal Logic” (written 
in the year of his death) on the topic discussed here. The topic that we now call 
Ramsey theory is today a keystone of combinatorial theory, and is important for 
many parts of mathematics. 

The general idea of Ramsey theory is to endeavor to find order in a set that is 
highly disordered. We will first illustrate the key idea with a popular example, and 
then we can discuss some of the more general principles. Let us consider an ordered 
sequence of questions: 


• How many people need to be present at a party in order to guarantee that there 
will be two people who are acquainted or two people who are not? This is 
a very simple question, and the answer is two. If there are two people at the 
party, then they either know each other or they do not. End of discussion. 


• How many people need to be present at a party in order to guarantee that 
there will be three people who are mutually acquainted or three people who 
are mutually unacquainted? This question is not quite so obvious. In fact five 
people will not do the trick (nor will four, three, or two people). To see this, 
it is useful and instructive to translate the question into one of graph theory. 

Figure 6.11 A graph on five vertices with no solid triangle and no dashed triangle. 


Imagine that each person is represented by a point in the plane (which 
we think of as a vertex of a graph—see Chap. 8). Connect two points by a 
solid line if those two people are acquainted. Connect two points by a dashed 
line if those two people are not acquainted. Note that any two points will be 
connected by either a solid line or a dashed line, because every pair of people 
is either acquainted or not. So the question is: Can we always be sure that, 
no matter how the solid lines and the dashed lines are configured, there will 
always be a solid triangle or a dashed triangle? (Note here that a solid triangle 
corresponds to three people who are mutually acquainted. A dashed triangle 
corresponds to three people who are mutually unacquainted.) As we have 
said, for five vertices the answer is no. This fact is illustrated in Fig. 6.11. 

It turns out that six is the magic number in order to guarantee three mutual 
acquaintances or three mutual unacquaintances. (Or, in other words, in a 
complete graph on six vertices consisting of solid lines and dashed lines, 
there will always be either a solid triangle or a dashed triangle.) To see this, 
imagine that one person in the party of six is named Astrid. There are five
other people besides Astrid. Either she is acquainted with at least three of
those or she is unacquainted with at least three of those (otherwise the other
guests would number at most four). If she is acquainted with three, then
examine those three. Either two of them know each other, or not. If two of
them know each other, then those two plus Astrid form a mutually acquainted
trio, just as we seek. If no two of the three know each other then those three
form a mutually unacquainted trio, just as we seek. If instead Astrid is
unacquainted with three members of the party, then examine those three. If two
of them are unacquainted, then Astrid and those two form a mutually
unacquainted trio. If instead the three are all acquainted, then those three
form a mutually acquainted trio. We see that, no matter what, there will be
either a mutually acquainted trio or a mutually unacquainted trio.
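The argument above can also be checked by brute force. The following is a minimal sketch, not part of the original discussion: it encodes "acquainted" as 1 (solid) and "unacquainted" as 0 (dashed), and the function names are our own. It tries every possible configuration of solid and dashed lines on n points and reports whether each one contains a triangle of a single type.

```python
from itertools import combinations, product

def has_mono_triangle(n, coloring):
    """coloring maps each pair (i, j) with i < j to 0 (dashed) or 1 (solid)."""
    return any(
        coloring[(a, b)] == coloring[(a, c)] == coloring[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )

def every_coloring_has_mono_triangle(n):
    """True if every solid/dashed configuration on n points has a one-type triangle."""
    edges = list(combinations(range(n), 2))
    return all(
        has_mono_triangle(n, dict(zip(edges, colors)))
        for colors in product((0, 1), repeat=len(edges))
    )

print(every_coloring_has_mono_triangle(5))  # False: five people are not enough
print(every_coloring_has_mono_triangle(6))  # True: six people always suffice
```

For six points there are only 2^15 = 32768 configurations, so the search is fast. The same approach becomes hopeless very quickly as the party grows, which is consistent with how hard the larger cases are.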

• Now how many people will it take at a party for there to be a mutually 
acquainted quartet or a mutually unacquainted quartet? This is a much harder 
problem. It can be shown that the right number is 18. 


In order to form a party so that there will be either five mutually acquainted 
people or five mutually unacquainted people it is known that the right size for 
the party is between 43 and 49 inclusive. But the correct value is not known. 
This problem is currently considered to be beyond our computing power. 


We list in a table now the known information about how many people r will be 
needed at a party in order to guarantee either k acquaintances or k unacquaintances. 
Note that, in our discussion above, we determined that when k = 2 then r = 2, 
when k = 3 then r = 6, when k = 4 then r = 18, and so forth. After k = 4, we 
indicate ranges of values for r because that is all that is known. 





 k    r

 2    2
 3    6
 4    18
 5    [43, 49]
 6    [102, 165]
 7    [205, 540]
 8    [282, 1870]
 9    [565, 6588]
10    [798, 23556]
11    [1597, 184755]
12    [1637, 705431]


Here is a simple interpretation of the third line of our table in terms of Internet 
sites. Suppose we look at 18 randomly chosen Websites or URLs. Then either four of 
them are about the same topic or four of them are about four mutually distinct topics. 
Another way to look at the matter is this: among 18 randomly chosen Websites, 
either four of them are all linked to each other, or four of them have no mutual links. 


Exercises 


1. There are 300 adult people in a room, none of them obese. Explain why two
   of them must have the same weight (in whole numbers of pounds).

2. There are 50 people in a room, none of them obese. Explain why two of them
   must have the same waist measurement (in whole numbers of inches).

3. Explain why the answer to Exercise 2 changes if the waist measurement is
   changed to whole numbers of millimeters.

4. There are 20 people sitting in a waiting room. The functionary in charge must
   choose five of these people to go to the green sanctuary and three of these
   people to go to the red sanctuary. In how many different ways can she do
   this?

5. In a standard deck of 52 playing cards, in how many different ways can you
   form two-of-a-kind?

6. In a standard deck of 52 playing cards, in how many different ways can you
   form four-of-a-kind?

7. A standard die used for gambling is a six-sided cube, with the sides numbered
   1 through 6. You usually roll two dice at a time, and the two face-up values
   are added together to give your score. What is the likelihood that you will
   roll a seven?

8. Refer to the last exercise for terminology. What is the chance that you will
   roll two dice and get a two? How about a 12? Are there any other values that
   give this same answer? Why or why not?

9. Again refer to Exercise 7 for terminology. Now suppose that you are rolling
   three dice. Your score is obtained by adding together the three face values.
   What is your probability of getting a 10?

10. Solve the recursion $a_0 = 3$, $a_1 = -5$, and $a_j = a_{j-1} + 2a_{j-2}$ for $j \ge 2$.

11. Suppose we take a finite collection of points in the plane and connect every
    pair with an edge. We color each edge either red or blue or green. Will there
    be a triangle of just one color? With 16 points the answer is “no” but with 17
    points the answer is “yes.” Discuss.


Matrices 


7.1 What Is a Matrix? 


A matrix is a rectangular array of numbers or variables or other algebraic objects. 
An example of a matrix is 


\[
\begin{pmatrix}
a & b & c & d & e \\
f & g & h & i & j \\
k & l & m & n & o \\
p & q & r & s & t
\end{pmatrix}
\]


We call this a 4 x 5 matrix because it has four rows and five columns. In general, 
an m x n matrix has m rows and n columns. 

When Arthur Cayley (1821–1895) invented matrices in the mid-nineteenth century,
he bragged that he had invented something that was of no earthly use. Rarely has
a person’s assessment of his own work been more inaccurate. Today matrices are
used in all parts of mathematics, in engineering and physics, in the social sciences,
in statistics, and in any part of analytical thought where it is necessary to keep track
of (and to manipulate) information. What is important about matrices is that they
can be combined in a number of useful ways (addition, multiplication, inversion,
composition, and others) and each of these operations has significance for the
information that the matrix contains. We shall learn a bit about these ideas in the
present chapter. 


7.2 Fundamental Operations on Matrices 


We typically denote a matrix by a capital roman letter like A or M. The elements of
the matrix A are designated by $a_{ij}$, where i is the row in which the element is located
and j is the column in which the element is located. To take a specific example,
consider the matrix 


3-.=1 Ao 2 
A= |-6 5 4 0 
1 9 14 -8 


For this matrix, $a_{23} = 4$ because the element of the matrix that is in the second
row and third column is 4. Likewise, $a_{32} = 9$ and $a_{33} = 14$. Notice that, for this
particular matrix, there are elements $a_{ij}$ for $1 \le i \le 3$ (because there are three rows)
and for $1 \le j \le 4$ (because there are four columns).

We can add two matrices only when they have the same size. If $A = (a_{ij})$ is an
m x n matrix and $B = (b_{ij})$ is another m x n matrix, then their sum is obtained
by adding corresponding entries:

\[
A + B = (a_{ij}) + (b_{ij}) = (a_{ij} + b_{ij})
\]


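EXAMPLE 7.1 
As a concrete illustration of matrix addition, let 

\[
A = \begin{pmatrix} 3 & -6 \\ -2 & 4 \\ 1 & 0 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} -5 & 3 \\ 1 & -6 \\ 0 & 11 \end{pmatrix}
\]

Then 

\[
A + B = \begin{pmatrix} 3 & -6 \\ -2 & 4 \\ 1 & 0 \end{pmatrix}
+ \begin{pmatrix} -5 & 3 \\ 1 & -6 \\ 0 & 11 \end{pmatrix}
= \begin{pmatrix} -2 & -3 \\ -1 & -2 \\ 1 & 11 \end{pmatrix}
\]


EXAMPLE 7.2 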
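Matrices of the same size can also be subtracted. With the same A and B as in the 
previous example, we have 

\[
A - B = \begin{pmatrix} 3 & -6 \\ -2 & 4 \\ 1 & 0 \end{pmatrix}
- \begin{pmatrix} -5 & 3 \\ 1 & -6 \\ 0 & 11 \end{pmatrix}
= \begin{pmatrix} 8 & -9 \\ -3 & 10 \\ 1 & -11 \end{pmatrix}
\]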


Next we turn to multiplication of matrices. This is a bit more subtle than addition, 
and the form it takes may be something of a surprise. A first guess might be that 
the product of matrices $A = (a_{ij})$ and $B = (b_{ij})$ will be $A \cdot B = (a_{ij} \cdot b_{ij})$. In other 
words, we guess that we would multiply matrices componentwise. This turns out 
not to be a satisfactory way to define matrix multiplication, for it would result in 
equations like 

\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \cdot
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} =
\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}
\]

In other words, the product of two nonzero matrices would be zero. This is not an 
attractive turn of events. We need a definition of matrix multiplication that avoids 
such problems, and also one that will preserve the essential information that is 
carried by the component matrices. These considerations motivate the definition 
that we are about to present. 

The key fact about matrix multiplication is that we can multiply a matrix A times 
a matrix B (in that order) provided that A is an m x n matrix and B is an n x k 
matrix. In other words, the number of columns in A must match the number of rows 
in B. As an instance, if 

\[
A = \begin{pmatrix} 2 & -1 \\ -5 & 4 \\ 6 & 2 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} -5 & 2 & 6 & 9 \\ 4 & -3 & 8 & 12 \end{pmatrix}
\]

then we may calculate $A \cdot B$ (because the number of columns in A matches the 
number of rows in B) but we may not calculate $B \cdot A$ (because the number of 
columns in B is four while the number of rows in A is three and these do not 
match). 

Now how do we actually calculate a matrix product? If
$A = (a_{ij})_{1 \le i \le m,\ 1 \le j \le n}$ and $B = (b_{rs})_{1 \le r \le n,\ 1 \le s \le p}$
(so that the number of columns in A matches the number of rows in B)
then we set $C = (c_{tu}) = A \cdot B$ and we define 

\[
c_{tu} = \sum_{k=1}^{n} a_{tk} b_{ku}
\]

for $1 \le t \le m$ and $1 \le u \le p$.

This is all a bit abstract, so let us look at a concrete example. 


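EXAMPLE 7.3 
Let 

\[
A = \begin{pmatrix} -3 & 2 & 10 \\ 6 & -4 & 9 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 1 & 1 & -3 & 6 \\ 2 & -2 & 1 & 4 \\ 6 & -5 & -2 & 9 \end{pmatrix}
\]

Let $C = (c_{tu}) = A \cdot B$. According to the rule, 

\[
c_{11} = \sum_{k=1}^{3} a_{1k} b_{k1} = (-3) \cdot 1 + 2 \cdot 2 + 10 \cdot 6 = 61
\]

Likewise 

\[
c_{12} = \sum_{k=1}^{3} a_{1k} b_{k2} = (-3) \cdot 1 + 2 \cdot (-2) + 10 \cdot (-5) = -57
\]

We calculate the other entries of $C = A \cdot B$ in a similar fashion. The final answer 
is 

\[
C = A \cdot B = \begin{pmatrix} 61 & -57 & -9 & 80 \\ 52 & -31 & -40 & 101 \end{pmatrix}
\]

Notice here that A is a 2 x 3 matrix and B is a 3 x 4 matrix. Thus we mentally 
cancel the matching 3s and see that the product must be a 2 x 4 matrix. 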
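The summation rule translates almost word for word into a short program. Here is a minimal sketch in Python, with matrices stored as lists of rows; the function name mat_mul is our own, and the entries of A and B are the ones that reproduce the product stated in Example 7.3.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A and an n x p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must match rows of B")
    p = len(B[0])
    # Entry (t, u) of the product is the sum over k of A[t][k] * B[k][u].
    return [
        [sum(A[t][k] * B[k][u] for k in range(n)) for u in range(p)]
        for t in range(len(A))
    ]

A = [[-3, 2, 10],
     [6, -4, 9]]
B = [[1, 1, -3, 6],
     [2, -2, 1, 4],
     [6, -5, -2, 9]]

for row in mat_mul(A, B):
    print(row)
```

Calling mat_mul(B, A) raises an error, because B has four columns while A has only two rows. This is the size-matching requirement at work.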














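EXAMPLE 7.4 
Let 

\[
A = \begin{pmatrix} 1 & 2 & 3 \\ 6 & 5 & 4 \\ -2 & 0 & -8 \end{pmatrix}
\]

and 

\[
B = \begin{pmatrix} 3 & 2 & 1 \\ 4 & 5 & 6 \\ -8 & 0 & -2 \end{pmatrix}
\]

Calculate both $A \cdot B$ and $B \cdot A$. 

Solution: Now 

\[
A \cdot B = \begin{pmatrix} -13 & 12 & 7 \\ 6 & 37 & 28 \\ 58 & -4 & 14 \end{pmatrix}
\]

Also 

\[
B \cdot A = \begin{pmatrix} 13 & 16 & 9 \\ 22 & 33 & -16 \\ -4 & -16 & -8 \end{pmatrix}
\]

One immediate lesson here is that $A \cdot B \ne B \cdot A$. Multiplication of matrices is not 
commutative. A second lesson is that both $A \cdot B$ and $B \cdot A$ can be calculated only 
when A is an m x n matrix and B is an n x m matrix; in particular, this is the case 
when both matrices are square and both are of the same size. 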














7.3 Gaussian Elimination 


In high school algebra everyone learns how to solve a system of two linear equations 
in two unknowns: 


\[
\begin{aligned}
ax + by &= \alpha \\
cx + dy &= \beta
\end{aligned}
\]

You simply multiply the first equation by a constant so that its x-coefficient matches 
the x-coefficient of the second equation. Then subtraction eliminates the x-variable 
and one can solve directly for y. Reverse substitution then yields the value of the 
x-variable, and the system is solved. 

Figure 7.1 Typical (empty) intersection of three lines in the plane. 


Matters are more complicated for systems of three equations in three unknowns, 
or more generally k equations in k unknowns. It becomes difficult to keep track of 
the information, and the attendant calculations can be daunting. In this section we 
introduce the method of Gaussian elimination, which gives a straightforward and 
virtually mechanical method for solving systems of linear equations. In fact it is 
straightforward to implement the method of Gaussian elimination on a computer; 
for a system of k equations in k unknowns it takes on the order of $k^3$ calculations to find the 
solution. So this is a robust and efficient technique. 

We concentrate our efforts on systems of k equations in k unknowns because 
such a system will generically have a unique solution (that is, a single value for 
each of the variables that solves the system). When there are fewer unknowns than 
equations then the system will typically have no solutions.¹ When there are more 
unknowns than equations then the system will typically have an entire space of 
solutions. For example, the solutions of the system 

\[
\begin{aligned}
x + z &= 4 \\
z &= 3
\end{aligned}
\]

are all points of the form (1, y, 3). In other words, the set of solutions forms an 
entire line. 

It requires ideas from linear algebra to handle these matters properly, and we 
cannot treat them here. So we will focus our attention on k equations in k unknowns. 

¹Think of intersecting lines in the plane. Two lines in the plane will usually have a point of intersection, as 
long as they are not parallel. Three lines in the plane will usually not have a mutual point of intersection; see 
Fig. 7.1. 
In the method of Gaussian elimination we will typically have a system of equations 

\[
\begin{aligned}
a^1_1 x_1 + a^1_2 x_2 + a^1_3 x_3 + \cdots + a^1_k x_k &= \alpha^1 \\
a^2_1 x_1 + a^2_2 x_2 + a^2_3 x_3 + \cdots + a^2_k x_k &= \alpha^2 \\
&\ \,\vdots \\
a^k_1 x_1 + a^k_2 x_2 + a^k_3 x_3 + \cdots + a^k_k x_k &= \alpha^k
\end{aligned}
\]

We will study this system by creating the associated augmented matrix 

\[
\left(\begin{array}{ccccc|c}
a^1_1 & a^1_2 & a^1_3 & \cdots & a^1_k & \alpha^1 \\
a^2_1 & a^2_2 & a^2_3 & \cdots & a^2_k & \alpha^2 \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
a^k_1 & a^k_2 & a^k_3 & \cdots & a^k_k & \alpha^k
\end{array}\right)
\]

Now there are certain allowable operations on the augmented matrix that we use 
to reduce it to a normalized form. The main diagonal of this matrix is the line of 
terms given by $a^1_1, a^2_2, a^3_3, \ldots, a^k_k$. We want all the terms below this main diagonal to 
be equal to 0. And we want the terms on the main diagonal to all be 1s. The three 
allowable operations are: 


1. Switch the position of two rows. 
2. Multiply any row by a nonzero constant. 


3. Add a multiple of one row to another. 


It turns out that these three simple moves are always adequate to do the job. The 
ideas are best illustrated with some examples. 


EXAMPLE 7.5 
Let us solve the system 

\[
\begin{aligned}
x - 2y + 3z &= 4 \\
x + y + z &= -2 \\
-x + 2y + z &= 1
\end{aligned}
\]

using Gaussian elimination. We begin by writing the augmented matrix: 

\[
\left(\begin{array}{ccc|c}
1 & -2 & 3 & 4 \\
1 & 1 & 1 & -2 \\
-1 & 2 & 1 & 1
\end{array}\right)
\]

Keeping in mind that our aim is to produce all zeros below the main diagonal, 
we subtract the first row from the second. This yields 

\[
\left(\begin{array}{ccc|c}
1 & -2 & 3 & 4 \\
0 & 3 & -2 & -6 \\
-1 & 2 & 1 & 1
\end{array}\right)
\]

This produces a zero below the main diagonal. 
Next we add the first row to the last. The result is 

\[
\left(\begin{array}{ccc|c}
1 & -2 & 3 & 4 \\
0 & 3 & -2 & -6 \\
0 & 0 & 4 & 5
\end{array}\right)
\]

Finally let us multiply the second row by 1/3 and the third row by 1/4. The end 
result is 

\[
\left(\begin{array}{ccc|c}
1 & -2 & 3 & 4 \\
0 & 1 & -2/3 & -2 \\
0 & 0 & 1 & 5/4
\end{array}\right)
\]

This is the normalized form that we seek. 
Writing our information again as a linear system, we have 

\[
\begin{aligned}
x - 2y + 3z &= 4 \\
0x + 1y - (2/3)z &= -2 \\
0x + 0y + z &= 5/4
\end{aligned}
\]

We may immediately read off from the last equation that z = 5/4. Substituting this 
information into the second equation gives y = -7/6. Lastly, putting both these 
values into the first equation yields x = -25/12. Thus the solution of our system 
is x = -25/12, y = -7/6, z = 5/4 or (-25/12, -7/6, 5/4). 

We encourage you to check this solution by plugging the values into the three 
equations of the original system. 
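Since the procedure is mechanical, it is natural to hand it to a computer. What follows is a minimal sketch in Python, not a production solver: the name gaussian_solve is our own, the code uses exact fractions so that no rounding occurs, and it assumes that the system has a unique solution (it raises an error otherwise). It uses only the three allowable row operations and then back substitutes.

```python
from fractions import Fraction

def gaussian_solve(aug):
    """Solve k equations in k unknowns from the augmented matrix (list of rows)."""
    k = len(aug)
    m = [[Fraction(v) for v in row] for row in aug]
    for col in range(k):
        # Operation 1: switch rows so that the diagonal entry is nonzero.
        pivot = next((r for r in range(col, k) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the system has no unique solution")
        m[col], m[pivot] = m[pivot], m[col]
        # Operation 2: scale the row so that the diagonal entry is 1.
        lead = m[col][col]
        m[col] = [v / lead for v in m[col]]
        # Operation 3: subtract multiples of this row to clear the entries below.
        for r in range(col + 1, k):
            factor = m[r][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    # Back substitution, starting from the last equation.
    x = [Fraction(0)] * k
    for i in reversed(range(k)):
        x[i] = m[i][k] - sum(m[i][j] * x[j] for j in range(i + 1, k))
    return x

# The system of Example 7.5, one row per equation.
solution = gaussian_solve([[1, -2, 3, 4],
                           [1, 1, 1, -2],
                           [-1, 2, 1, 1]])
print([str(v) for v in solution])
```

The order of the row operations differs slightly from the hand calculation (each row is normalized as soon as it is reached), but the allowable operations and the final answer are the same.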














This was so easy that we are granted the courage now to attack a system of four 
equations in four unknowns. It is gratifying how straightforward the procedure is: 


EXAMPLE 7.6 
Let us use Gaussian elimination to solve the system of four equations in four 
unknowns given by 

\[
\begin{aligned}
2x - y + 3z + w &= 2 \\
x + y - z - w &= -1 \\
x + z + w &= 4 \\
-x - y + 2w &= 1
\end{aligned}
\]

As before, our first step is to write the associated augmented matrix: 

\[
\left(\begin{array}{cccc|c}
2 & -1 & 3 & 1 & 2 \\
1 & 1 & -1 & -1 & -1 \\
1 & 0 & 1 & 1 & 4 \\
-1 & -1 & 0 & 2 & 1
\end{array}\right)
\]

Notice that we were careful in the third row to put a 0 in the second position because 
the third equation has no y. Likewise we put a 0 in the third position of the fourth 
row because the fourth equation has no z. 

Let us begin by multiplying the first row by 1/2. Thus we have 

\[
\left(\begin{array}{cccc|c}
1 & -1/2 & 3/2 & 1/2 & 1 \\
1 & 1 & -1 & -1 & -1 \\
1 & 0 & 1 & 1 & 4 \\
-1 & -1 & 0 & 2 & 1
\end{array}\right)
\]

Now we subtract the first row from the second row and the third row (we are 
performing two operations at once). Thus 

\[
\left(\begin{array}{cccc|c}
1 & -1/2 & 3/2 & 1/2 & 1 \\
0 & 3/2 & -5/2 & -3/2 & -2 \\
0 & 1/2 & -1/2 & 1/2 & 3 \\
-1 & -1 & 0 & 2 & 1
\end{array}\right)
\]

Next we add the first row to the last. So 

\[
\left(\begin{array}{cccc|c}
1 & -1/2 & 3/2 & 1/2 & 1 \\
0 & 3/2 & -5/2 & -3/2 & -2 \\
0 & 1/2 & -1/2 & 1/2 & 3 \\
0 & -3/2 & 3/2 & 5/2 & 2
\end{array}\right)
\]


We see now the advantage of working systematically (that is, normalizing the 
first column first). For now we can add the second row to the fourth row. Therefore 

\[
\left(\begin{array}{cccc|c}
1 & -1/2 & 3/2 & 1/2 & 1 \\
0 & 3/2 & -5/2 & -3/2 & -2 \\
0 & 1/2 & -1/2 & 1/2 & 3 \\
0 & 0 & -1 & 1 & 0
\end{array}\right)
\]


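Now we subtract 1/3 of the second row from the third. The result is 

\[
\left(\begin{array}{cccc|c}
1 & -1/2 & 3/2 & 1/2 & 1 \\
0 & 3/2 & -5/2 & -3/2 & -2 \\
0 & 0 & 1/3 & 1 & 11/3 \\
0 & 0 & -1 & 1 & 0
\end{array}\right)
\]

The last step is to add three times the third row to the fourth. Hence 

\[
\left(\begin{array}{cccc|c}
1 & -1/2 & 3/2 & 1/2 & 1 \\
0 & 3/2 & -5/2 & -3/2 & -2 \\
0 & 0 & 1/3 & 1 & 11/3 \\
0 & 0 & 0 & 4 & 11
\end{array}\right)
\]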


We conclude by multiplying each row by a suitable constant so that the lead term 
is 1. We finally have the augmented matrix 

\[
\left(\begin{array}{cccc|c}
1 & -1/2 & 3/2 & 1/2 & 1 \\
0 & 1 & -5/3 & -1 & -4/3 \\
0 & 0 & 1 & 3 & 11 \\
0 & 0 & 0 & 1 & 11/4
\end{array}\right)
\]

Translating back to a linear system, we have 

\[
\begin{aligned}
1x - (1/2)y + (3/2)z + (1/2)w &= 1 \\
0x + 1y - (5/3)z - 1w &= -4/3 \\
0x + 0y + 1z + 3w &= 11 \\
0x + 0y + 0z + w &= 11/4
\end{aligned}
\]

We see immediately that w = 11/4. Back substituting, we find that z = 11/4. 
Again back substituting, we find that y = 6. Finally, the first equation tells us 
that x = -3/2. So the solution is x = -3/2, y = 6, z = 11/4, w = 11/4 or 
(-3/2, 6, 11/4, 11/4). 














7.4 The Inverse of a Matrix 


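The matrix 

\[
I = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}
\]

is called the identity matrix. It has this name because if we multiply it with any 
square matrix A of the same size we obtain $I \cdot A = A \cdot I = A$. For example, if 

\[
A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}
\]

then 

\[
I \cdot A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \cdot
\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} =
\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = A
\]

Likewise, 

\[
A \cdot I = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} \cdot
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} =
\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = A
\]

So I plays the role of the multiplicative identity. 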

Let A be a given square matrix. So A has size n x n for some positive integer n. 
We seek a matrix A’ so that $A \cdot A' = I$ and $A' \cdot A = I$. The matrix A’ will play the 
role of the multiplicative inverse of A. We call it the inverse matrix to A. 

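Now it is a fact that not every matrix has a multiplicative inverse. Consider the 
example of 

\[
A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
\]

If A’ is any matrix, 

\[
A' = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]

then 

\[
A' \cdot A = \begin{pmatrix} a & 0 \\ c & 0 \end{pmatrix}
\]

Thus $A' \cdot A$ cannot possibly be the identity matrix. A similar calculation shows that 
$A \cdot A'$ cannot possibly be the identity matrix. So the matrix A definitely does not 
have an inverse. 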

It is useful to have a simple and direct calculation that will tell us when a given 
matrix has an inverse and when it does not. And in fact there is such a calculation 
which we shall now learn. It is based on the Gaussian elimination technique that 
was presented in the last section. 

The idea is best understood in the context of 2 x 2 matrices. A typical 2 x 2 matrix 
that does not have an inverse is one in which the second row is a multiple of the first: 

\[
A = \begin{pmatrix} a & b \\ \lambda a & \lambda b \end{pmatrix}
\]

If we guess that some matrix 

\[
A' = \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix}
\]

is an inverse for A, then we would have 

\[
I = A \cdot A' = \begin{pmatrix} aa' + bc' & ab' + bd' \\ \lambda aa' + \lambda bc' & \lambda ab' + \lambda bd' \end{pmatrix}
\]

We see in the product that the second row is a constant multiple $\lambda$ times the first 
row. So the product matrix certainly cannot be the identity I, and we see that A 
cannot be invertible. 


Thus, in the context of 2 x 2 matrices, the situation to rule out is that one row 
be a multiple of the other. It turns out that a convenient way to do this is by way of 
the determinant. If 

\[
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]

is a given matrix then its determinant is given by 

\[
\det A = ad - bc
\]

The two rows are not multiples of each other precisely when $\det A \ne 0$. This will 
be precisely the circumstance in which we will be able to find an inverse for the 
matrix A. 


EXAMPLE 7.7 
Let 

\[
M = \begin{pmatrix} -5 & 2 \\ 4 & -6 \end{pmatrix}
\]

Calculate det M. 

Solution: We see that 

\[
\det M = (-5) \cdot (-6) - 2 \cdot 4 = 22 \ne 0
\]

The situation for a 3 x 3 matrix is analogous. Let 


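\[
A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}
\]

be a 3 x 3 matrix. The determinant is defined to be 

\[
\det A = a \cdot \det \begin{pmatrix} e & f \\ h & i \end{pmatrix}
- b \cdot \det \begin{pmatrix} d & f \\ g & i \end{pmatrix}
+ c \cdot \det \begin{pmatrix} d & e \\ g & h \end{pmatrix}
\]

This is a natural sort of generalization of the determinant for 2 x 2 matrices. We 
call 

\[
\begin{pmatrix} e & f \\ h & i \end{pmatrix}
\]

the minor associated to the entry a. Likewise we call 

\[
\begin{pmatrix} d & f \\ g & i \end{pmatrix}
\]

the minor associated to the entry b and 

\[
\begin{pmatrix} d & e \\ g & h \end{pmatrix}
\]

the minor associated to the entry c. 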

What we have done here is to expand the determinant by the first row. It is also 
possible to expand the determinant by any row or column, just so long as we assign 
the right plus or minus signs to each of the components. For the purposes of the 
present book, expansion by the first row will suffice. 


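EXAMPLE 7.8 
Let 

\[
B = \begin{pmatrix} -7 & 4 & -3 \\ 2 & -5 & 3 \\ 1 & -3 & -2 \end{pmatrix}
\]

Calculate det B. 

Solution: We calculate that 

\[
\begin{aligned}
\det B &= (-7) \cdot \det \begin{pmatrix} -5 & 3 \\ -3 & -2 \end{pmatrix}
- 4 \cdot \det \begin{pmatrix} 2 & 3 \\ 1 & -2 \end{pmatrix}
+ (-3) \cdot \det \begin{pmatrix} 2 & -5 \\ 1 & -3 \end{pmatrix} \\
&= (-7) \cdot 19 - 4 \cdot (-7) + (-3) \cdot (-1) \\
&= -102 \\
&\ne 0
\end{aligned}
\]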
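The two determinant formulas are easy to express as code. The following is a small sketch in Python (the function names det2 and det3 are our own); det3 expands by the first row exactly as in the definition above, calling det2 on each minor.

```python
def det2(m):
    """Determinant ad - bc of a 2 x 2 matrix given as a list of two rows."""
    (a, b), (c, d) = m
    return a * d - b * c

def det3(m):
    """Determinant of a 3 x 3 matrix, expanded by the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return (a * det2([[e, f], [h, i]])
            - b * det2([[d, f], [g, i]])
            + c * det2([[d, e], [g, h]]))

B = [[-7, 4, -3],
     [2, -5, 3],
     [1, -3, -2]]
print(det3(B))  # -102, in agreement with Example 7.8
```

A matrix whose second row is a multiple of the first, such as [[1, 2], [3, 6]], gives det2 equal to 0, which is the situation in which no inverse exists.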














It is worth summarizing the main rule for inverses for which we have given an 
indication of the justification: 


Rule for inverses: A square matrix has an inverse if and only if the matrix 
has nonzero determinant. 


This still does not tell us how to find the inverse, but we shall get to that momentarily. 

There are in fact a number of ways to calculate the inverse of a matrix, and we 
shall indicate two of them here. The first is the method of Gaussian elimination, 
which is a powerful technique that can be used for many purposes. The idea is to 
take the given square matrix 

\[
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]

and to augment it by adjoining the identity matrix as in the display below: 

\[
\left(\begin{array}{cc|cc}
a & b & 1 & 0 \\
c & d & 0 & 1
\end{array}\right)
\]

Now the method of Gaussian elimination allows us to perform certain operations 
on the rows of this augmented matrix, the goal being to reduce the square matrix 
on the left to the identity matrix. Whatever matrix results on the right will be the 
inverse. 

The three allowable operations are: 

(1) We can switch the position of any two rows. 
(2) We can multiply any given row by a nonzero constant. 
(3) We can add any multiple of one row to another row. 

Let us illustrate the method with a concrete example. 


EXAMPLE 7.9 
Let 

\[
M = \begin{pmatrix} 2 & 4 \\ -3 & 5 \end{pmatrix}
\]

Then $\det M = 2 \cdot 5 - (-3) \cdot 4 = 22 \ne 0$. Hence this matrix passes the test, and it 
has an inverse. 
We now examine the augmented matrix 

\[
\left(\begin{array}{cc|cc}
2 & 4 & 1 & 0 \\
-3 & 5 & 0 & 1
\end{array}\right)
\]

Our job is to perform Gaussian elimination and to thereby transform the left-hand 
square matrix into the identity matrix. 
We begin by multiplying the first row through by 1/2. The result is 

\[
\left(\begin{array}{cc|cc}
1 & 2 & 1/2 & 0 \\
-3 & 5 & 0 & 1
\end{array}\right)
\]

Now we add three times the first row to the second. We obtain the augmented matrix 

\[
\left(\begin{array}{cc|cc}
1 & 2 & 1/2 & 0 \\
0 & 11 & 3/2 & 1
\end{array}\right)
\]

What we have achieved thus far is that the first column of the left-hand square 
matrix looks like the first column of the identity. 
Now we subtract 2/11 of the second row from the first. The result is 

\[
\left(\begin{array}{cc|cc}
1 & 0 & 5/22 & -2/11 \\
0 & 11 & 3/2 & 1
\end{array}\right)
\]

Finally we multiply the second row by 1/11. In the end, then, we obtain 

\[
\left(\begin{array}{cc|cc}
1 & 0 & 5/22 & -2/11 \\
0 & 1 & 3/22 & 1/11
\end{array}\right)
\]

What we read off from this last augmented matrix is that the inverse of the 
original matrix M is 

\[
M^{-1} = \begin{pmatrix} 5/22 & -2/11 \\ 3/22 & 1/11 \end{pmatrix}
\]

We invite you to test the result by multiplying $M \cdot M^{-1}$. 
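The same row reduction can be carried out by a short program. Below is a minimal sketch in Python, under the assumption that the input is a square matrix given as a list of rows; the function name inverse is our own. It adjoins the identity, reduces the left-hand block to the identity using the three allowable operations, and reads off the right-hand block. Exact fractions are used so the answer can be compared directly with the hand calculation.

```python
from fractions import Fraction

def inverse(matrix):
    """Invert a square matrix by row reducing the augmented matrix [A | I]."""
    n = len(matrix)
    aug = [
        [Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Switch rows if needed so that the diagonal entry is nonzero.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the row so that the diagonal entry becomes 1.
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        # Clear every other entry in this column.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right-hand block is now the inverse.
    return [row[n:] for row in aug]

M = [[2, 4],
     [-3, 5]]
for row in inverse(M):
    print([str(v) for v in row])
```

If the matrix has determinant zero, the search for a nonzero diagonal entry eventually fails and the function raises an error, in keeping with the rule for inverses.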


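EXAMPLE 7.10 
Let us calculate the inverse of the matrix 

\[
C = \begin{pmatrix} 3 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 0 & 1 \end{pmatrix}
\]

We begin by calculating the determinant to make sure that the matrix C is invertible: 

\[
\begin{aligned}
\det C &= 3 \cdot \det \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}
- 0 \cdot \det \begin{pmatrix} 0 & 2 \\ 1 & 1 \end{pmatrix}
+ 1 \cdot \det \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \\
&= 3 \cdot 1 - 0 \cdot (-2) + 1 \cdot (-1) = 2 \ne 0
\end{aligned}
\]

Thus C passes the test and we may compute the inverse matrix. 
We write the augmented matrix as 

\[
\left(\begin{array}{ccc|ccc}
3 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 2 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 1
\end{array}\right)
\]

We begin by multiplying the first row by 1/3: 

\[
\left(\begin{array}{ccc|ccc}
1 & 0 & 1/3 & 1/3 & 0 & 0 \\
0 & 1 & 2 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & 0 & 1
\end{array}\right)
\]

Next we subtract the first row from the third row: 

\[
\left(\begin{array}{ccc|ccc}
1 & 0 & 1/3 & 1/3 & 0 & 0 \\
0 & 1 & 2 & 0 & 1 & 0 \\
0 & 0 & 2/3 & -1/3 & 0 & 1
\end{array}\right)
\]

Now we subtract three times the third row from the second row: 

\[
\left(\begin{array}{ccc|ccc}
1 & 0 & 1/3 & 1/3 & 0 & 0 \\
0 & 1 & 0 & 1 & 1 & -3 \\
0 & 0 & 2/3 & -1/3 & 0 & 1
\end{array}\right)
\]

As a penultimate step, we multiply the last row by 3/2. The result is: 

\[
\left(\begin{array}{ccc|ccc}
1 & 0 & 1/3 & 1/3 & 0 & 0 \\
0 & 1 & 0 & 1 & 1 & -3 \\
0 & 0 & 1 & -1/2 & 0 & 3/2
\end{array}\right)
\]

Finally, we subtract one-third of the last row from the first row. We obtain: 

\[
\left(\begin{array}{ccc|ccc}
1 & 0 & 0 & 1/2 & 0 & -1/2 \\
0 & 1 & 0 & 1 & 1 & -3 \\
0 & 0 & 1 & -1/2 & 0 & 3/2
\end{array}\right)
\]

Thus the inverse matrix to the original matrix C is 

\[
C^{-1} = \begin{pmatrix} 1/2 & 0 & -1/2 \\ 1 & 1 & -3 \\ -1/2 & 0 & 3/2 \end{pmatrix}
\]

As usual, we may check our work: 

\[
C \cdot C^{-1} = \begin{pmatrix} 3 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1/2 & 0 & -1/2 \\ 1 & 1 & -3 \\ -1/2 & 0 & 3/2 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]


7.5 Markov Chains 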


A Markov chain—named after Andrei Andreyevich Markov (1856—1922)—is a 
random process in which future states of the process may depend on the present 
state but not on previous states. An example is the flip of a coin. You can flip a coin 
100 times and get heads every time (it’s not very likely, but it could happen). This 
tells you nothing about what the next flip will be: the chances are 50% heads and 
50% tails and that is all there is to it. Many gambling fallacies and “systems” are 
based on a misunderstanding of this simple principle. 

Another example of a Markov chain is the weather. The weather tomorrow may 
depend in part on the weather today, but it does not depend on the weather in the past. 

Let us begin to understand how to analyze a Markov chain by examining a 
concrete example. 


EXAMPLE 7.11 

Suppose that we live in a climate in which a sunny day is 80% likely to be followed 
by another sunny day. And a rainy day is 60% likely to be followed by another 
rainy day (so that a rainy day is 40% likely to be followed by a sunny day). We 
represent the state of the weather on any given day by a vector (s, r) 
where s stands for the likelihood of sun and r stands for the likelihood of rain. 
We may represent the stated probabilities (which are garnered from 5 years’ 
observation of the weather in this region) with the matrix 

\[
P = \begin{pmatrix} 0.8 & 0.4 \\ 0.2 & 0.6 \end{pmatrix}
\]

What does this mean? Suppose that we begin our analysis on the first day (call 
it day 1) on which it is sunny. Thus the vector representing the status today is 
$x^{(1)} = (1, 0)$, because it is definitely sunny and it is not raining. Now we calculate 
the next state (the weather tomorrow, which will be represented by a vector $x^{(2)}$) 
by applying the given probabilities to $x^{(1)}$. This is done simply by multiplying the 
vector $x^{(1)}$ by the matrix P. Thus 

\[
x^{(2)} = P \cdot x^{(1)} = \begin{pmatrix} 0.8 & 0.4 \\ 0.2 & 0.6 \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0.8 \\ 0.2 \end{pmatrix}
\]

Notice that when we are in text we write our vectors (horizontally) as (s, r), but 
when we calculate with matrices we write the vectors vertically. This is customary 
in mathematics. 



We have learned that, on the second day, there is 0.8 probability (or 80% prob- 
ability) that it will be sunny and 0.2 probability (or 20% probability) that it will 
rain. 

It is worth noting that the columns of the matrix P are nonnegative numbers 
that sum to 1. And the entries of our vectors are nonnegative numbers that sum to 
1. This is because they are probabilities (which always lie between 0 and 1), and 
they represent all possible outcomes. The sum of the probabilities of all possible 
outcomes will of course be 1. 

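In the same fashion we can determine the likely weather on day 3. We see that 

\[
x^{(3)} = P \cdot x^{(2)} = P \begin{pmatrix} 0.8 \\ 0.2 \end{pmatrix}
= \begin{pmatrix} 0.8 & 0.4 \\ 0.2 & 0.6 \end{pmatrix}
\begin{pmatrix} 0.8 \\ 0.2 \end{pmatrix}
= \begin{pmatrix} 0.72 \\ 0.28 \end{pmatrix}
\]
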
We find that, on the third day, there is 0.72 probability that it will be sunny and 0.28 
probability that it will rain. Again, the two probabilities add up to 1. 














As time goes on, the weather predictions become less and less accurate. We all 
know from experience that we cannot predict the weather on May 15 by examining 
the weather on April 15. Yet that is what we are doing in the last example: we 
predict the weather on day 2 by doing a calculation based on the weather values 
on day 1; we predict the weather on day 3 by doing a calculation with the weather 
values on day 2 (which were in turn determined from the weather values on day 1); 
and so forth. It is natural to wonder what “steady state” the weather may be tending 
to as time marches on to infinity. 

Formulated mathematically, we want to calculate the limit 

\[
q = \lim_{j \to \infty} x^{(j)} \qquad (7.1)
\]

This can be rewritten as 

\[
q = \lim_{j \to \infty} P^{j-1} x^{(1)}
\]

where the power of P indicates that the matrix P is applied (or multiplied in) that 
many times. Applying P to both sides of the equation, we find that 

\[
Pq = P \left[ \lim_{j \to \infty} P^{j-1} x^{(1)} \right]
= \lim_{j \to \infty} P \cdot P^{j-1} x^{(1)}
= \lim_{j \to \infty} P^{j} x^{(1)} = q
\]

In the last equality we use the fact, recorded in (7.1), that q is the limit of iterates 
of P applied to $x^{(1)}$. 

Thus we discover that the limiting, or steady state vector for our weather system 
is a vector q that is fixed by the matrix P. The operative equation is 


Pq=q 


EXAMPLE 7.12 
Let us find the steady state vector q for the weather system described in the last 
example. 

We seek a vector $q = (q_1, q_2)$ such that the $q_j$’s are nonnegative and sum to 1, 
and so that 

\[
Pq = q
\]

This may be written out as 

\[
\begin{pmatrix} 0.8 & 0.4 \\ 0.2 & 0.6 \end{pmatrix}
\begin{pmatrix} q_1 \\ q_2 \end{pmatrix} = \begin{pmatrix} q_1 \\ q_2 \end{pmatrix}
\]

This translates to 

\[
\begin{pmatrix} 0.8 q_1 + 0.4 q_2 \\ 0.2 q_1 + 0.6 q_2 \end{pmatrix}
= \begin{pmatrix} q_1 \\ q_2 \end{pmatrix}
\]

or 

\[
\begin{aligned}
0.8 q_1 + 0.4 q_2 &= q_1 \\
0.2 q_1 + 0.6 q_2 &= q_2
\end{aligned}
\]

Notice that these simplify to 

\[
\begin{aligned}
-0.2 q_1 + 0.4 q_2 &= 0 \\
0.2 q_1 - 0.4 q_2 &= 0
\end{aligned}
\]

These two equations are redundant, as they are multiples of each other. 
Thus we are reduced to solving 

\[
-0.2 q_1 + 0.4 q_2 = 0
\]

alongside the condition 

\[
q_1 + q_2 = 1
\]

This is easy to do, and the solution is $q_1 = 2/3 \approx 0.6666$ and $q_2 = 1/3 \approx 0.3333$. 
We conclude that the steady state weather system is (0.6666, 0.3333). In the 
long run, it is twice as likely that we will have sun as that it will rain. 
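One way to see the steady state emerge is simply to apply P over and over, starting from the sunny day (1, 0). The sketch below does this in Python with ordinary floating-point numbers; the function names step and iterate are our own, and the choice of 30 days is an arbitrary illustration.

```python
def step(P, x):
    """Apply the matrix P once to the state vector x."""
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]

def iterate(P, x, days):
    """Apply P repeatedly, advancing the state by the given number of days."""
    for _ in range(days):
        x = step(P, x)
    return x

P = [[0.8, 0.4],
     [0.2, 0.6]]
day1 = [1.0, 0.0]

day2 = step(P, day1)          # approximately (0.8, 0.2)
day3 = step(P, day2)          # approximately (0.72, 0.28)
later = iterate(P, day1, 30)  # very close to (2/3, 1/3)
print(day2, day3, later)
```

Starting instead from a rainy day, (0, 1), leads to the same limiting vector, which suggests that the steady state does not depend on the weather on day 1.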














You Try It: A rain forest has three types of trees: young (15–25 years), middle-aged 
(26–45 years), and old (46 years or older). Let y(j), m(j), and o(j) denote 
the number of each type of tree, respectively. Let $d_y$, $d_m$, and $d_o$ denote the loss 
rates (expressed in percent) of each type of tree. Let b(j) denote the number of 
baby trees (less than 15 years old) in the forest and $d_b$ the loss rate for baby trees. 

Express b(j + 1) in terms of b(j), y(j), m(j), and o(j) (and the percentages 
indicated above). Write down a matrix that represents the Markov process that is 
present in this system. 


7.6 Linear Programming 


Linear programming is a fundamental mathematical tool that has existed—at least 
in its modern form—since M. K. Wood and G. B. Dantzig’s fundamental papers 
[WOD] and [DAN] in 1949. The basic idea is that one wants to reduce an im- 
possibly complex—or computationally expensive—problem to a more tractable 
problem using some kind of analysis. Dantzig’s method of linear programming is 
extraordinarily effective at this job. Today the major airlines use a version of the 
simplex method to schedule flights and many industries use linear programming 
to schedule job runs and other tasks. There are many standard computer packages 
that are dedicated to linear programming methods. 

In the most basic linear programming setting, one is given an objective function 
f that is linear: 


\[
f(x_1, x_2, \ldots, x_N) = a_1 x_1 + a_2 x_2 + \cdots + a_N x_N
\]

and one wishes to maximize (or minimize) this function as the variable point 
$(x_1, x_2, \ldots, x_N)$ varies over a region R in space that is defined by some linear 
inequalities (called constraints): 

\[
\begin{aligned}
b^1_1 x_1 + b^1_2 x_2 + \cdots + b^1_N x_N &\le \beta^1 \\
b^2_1 x_1 + b^2_2 x_2 + \cdots + b^2_N x_N &\le \beta^2 \\
&\ \,\vdots \\
b^k_1 x_1 + b^k_2 x_2 + \cdots + b^k_N x_N &\le \beta^k
\end{aligned}
\]

Figure 7.2 A feasible region in space. 


See Fig. 7.2. We call this region in space the feasible region. A linear inequality such 
as we see in the last display describes a halfspace. So the points that simultaneously 
satisfy all the linear inequalities will typically describe a convex polytope in space. 
That is what the figure suggests. 

Now the graph of a linear function such as f above is just a plane or (in higher 
dimensions) a hyperplane. So picture such a graph lying over the feasible region 
R. See Fig. 7.3. It is not difficult to picture that, in some directions, the graph of 
the linear function goes uphill. And there will be a particular direction in which 
it goes uphill as steeply as possible. The function will reach its maximum value at 
that point on the graph which is as high as possible, and that will occur at a corner 
of the feasible region. This is the basic idea behind linear programming. 

Figure 7.3 The graph of a linear function over a feasible region. 

A detailed course in the theory of linear programming would teach you matrix 
techniques for finding the corner or vertex at which the maximum is achieved. This 
is actually quite an interesting process. In a typical application—such as scheduling 
flights for an airline—the feasible region will have thousands of vertices, and the 
objective function f will have hundreds or thousands of variables. So one certainly 
would not want to check every possibility by hand. 

Of course a computer can help, but these problems are so large that even a 
computer gets bogged down with the calculations. The simplex method of Dantzig 
gives a very efficient way of zeroing in on the vertices where the extrema lie. 

In this very brief treatment we cannot get into all the particulars of the linear 
programming method. But we can work some simple examples that illustrate the 
idea behind the solution of a linear programming problem. 


EXAMPLE 7.13 
Find the maximum value of the objective function 


f(x,y) =2x +7y 


over the feasible region 





y—2x <3 
yt3x>—-4 
y—10x > -2 











Solution: Figure 7.4 illustrates the feasible region. 

The three points of intersection of the edges (that is, the vertices or corners) are 
easy to solve for using elementary techniques. These are (−7/5, 1/5), (5/8, 17/4), 
and (−2/13, −46/13). Now we know from Dantzig’s theory of linear programming 
(that is, the simplex method) that the extrema of the function f will occur at one 
of these three vertices. This problem is small enough so that it is tractable to just 
plug in the points and examine the values: 
Figure 7.4 The feasible region for Example 7.13. 


f(−7/5, 1/5) = −7/5,   f(5/8, 17/4) = 31,   f(−2/13, −46/13) = −326/13 


Plainly the greatest value is 31, and it is assumed at the point (5/8, 17/4). 
We may also note that the least value is −326/13, and it is assumed at the point 
(−2/13, −46/13). 
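The vertex check in Example 7.13 can also be carried out mechanically. The following is a minimal sketch, not a general linear programming solver: it assumes the three constraints are the closed inequalities y − 2x ≤ 3, y + 3x ≥ −4, y − 10x ≥ −2, and the names `CONSTRAINTS`, `intersection`, and `feasible_vertices` are our own. It intersects the boundary lines in pairs, keeps the intersection points that satisfy every constraint, and evaluates f at each.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is stored as (a, b, c), meaning a*x + b*y <= c.
# Rows that were ">=" in the example have been multiplied by -1.
# Treating the inequalities as closed is an assumption of this sketch.
CONSTRAINTS = [
    (-2, 1, 3),    # y - 2x <= 3
    (-3, -1, 4),   # y + 3x >= -4
    (10, -1, 2),   # y - 10x >= -2
]

def objective(x, y):
    return 2 * x + 7 * y

def intersection(c1, c2):
    """Solve the 2x2 system of two boundary lines by Cramer's rule."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None  # parallel boundary lines never meet
    x = Fraction(r1 * b2 - r2 * b1, det)
    y = Fraction(a1 * r2 - a2 * r1, det)
    return x, y

def feasible(point):
    x, y = point
    return all(a * x + b * y <= c for a, b, c in CONSTRAINTS)

def feasible_vertices():
    points = (intersection(c1, c2) for c1, c2 in combinations(CONSTRAINTS, 2))
    return [p for p in points if p is not None and feasible(p)]

vertices = feasible_vertices()
best = max(vertices, key=lambda p: objective(*p))
worst = min(vertices, key=lambda p: objective(*p))
print(best, objective(*best))
print(worst, objective(*worst))
```

Exact fractions are used so that the vertices come out as 5/8 and 17/4 rather than rounded decimals. Checking every pair of constraints is only sensible for tiny problems; avoiding that exhaustive search is the whole point of the simplex method.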














EXAMPLE 7.14 
Find the extrema of the objective function 


f(x, y, z) = 3x − 4y + 7z 
Figure 7.5 The feasible region for Example 7.14. 


on the feasible region 





x ≥ 0,  x ≤ 1 
y ≥ 0,  y ≤ 1 
z ≥ 0,  z ≤ 1 











Solution: Figure 7.5 illustrates the feasible region. It is of course a cube. 

It is no trouble at all to observe that the vertices (corners) are the points (1, 0, 0), 
(0, 1, 0), (0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 0), (0, 0, 0), and (1, 1, 1). We may 
calculate the values of f at those points directly: 


f(1, 0, 0) = 3 
f(0, 1, 0) = −4 
f(0, 0, 1) = 7 
f(0, 1, 1) = 3 
f(1, 0, 1) = 10 
f(1, 1, 0) = −1 
f(0, 0, 0) = 0 
f(1, 1, 1) = 6 


We see that the greatest value of f is 10, and it is assumed at the point (1, 0, 1). 
The least value of f is −4, and it is assumed at the point (0, 1, 0). 
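The eight evaluations in Example 7.14 are easy to reproduce. Here is a short sketch, assuming the feasible region is the unit cube whose corners are listed above; the names `corners` and `values` are ours.

```python
from itertools import product

def f(x, y, z):
    return 3 * x - 4 * y + 7 * z

# The eight corners of the unit cube are all triples of 0s and 1s.
corners = list(product((0, 1), repeat=3))
values = {corner: f(*corner) for corner in corners}

max_corner = max(values, key=values.get)
min_corner = min(values, key=values.get)
print(max_corner, values[max_corner])
print(min_corner, values[min_corner])
```

Notice the pattern: on a cube, the maximum sets each variable with a positive coefficient to 1 and each variable with a negative coefficient to 0, and the minimum does the reverse.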














Exercises 
In Exercises 1 and 2, say what the dimensions of the matrix are. 
34.02 ai, 
1 4 -9 -2 
"12 1 6 
5 0 0 
3 5 -4 2 0 
2.71 1 9 4 -2 
6 4 3 2 -l 


In Exercises 3, 4, 5, and 6, calculate the indicated matrix operation. 


S 6 2 8 
Aa DETTAT a49 
6 —5 4 nen 
al, 3 
3 44 
7. Calculate the inverse of the matrix 


1 0 1 
0 1 1 
2 1 0 


8. Without actually calculating the inverse, say whether this matrix has an inverse: 
3 2 1 
al -3 -2 
0 1 6 


9. An unfair coin has twice the likelihood of landing heads as tails. Describe 
the coin flips as a Markov process. 


CHAPTER 8 





Graph Theory 


8.1 Introduction 


We learn even in high school about graphs of functions. The graph of a function 
is usually a curve drawn in the x-y plane. See Fig. 8.1. But the word “graph” has 
other meanings. In finite or discrete mathematics, a graph is a collection of points 
and edges or arcs in the plane. Fig. 8.2 illustrates a graph as we are now discussing 
the concept. 

Leonhard Euler (1707—1783) is considered to have been the father of graph 
theory. His paper in 1736 on the seven bridges of Königsberg is considered to have 
been the foundational paper in the subject. It is worthwhile now to review that topic. 

Königsberg is a town, founded in 1256, that was originally in Prussia. After 
a stormy history, the town became part of the Soviet Union and was renamed 
Kaliningrad in 1946. In any event, during Euler’s time the town had seven bridges 
(named Krämer, Schmiede, Holz, Hohe, Honig, Köttel, and Grüne) spanning 
the Pregel River. Fig. 8.3 gives a simplified picture of how the bridges were 
originally configured (two of the bridges were later destroyed during World War II, 
and two others demolished by the Russians). The question that fascinated people 
Figure 8.1 A graph of a function in the plane. 


Figure 8.2 A graph as a combinatorial object. 


Figure 8.3 The seven bridges at Königsberg. 
in the eighteenth century was whether it was possible to walk a route that never 
repeats any part of the path and that crosses each bridge exactly once. 

Euler in effect invented graph theory and used his ideas to show that it is impos- 
sible to devise such a route. We shall, in the subsequent sections, devise a broader 
version of Euler’s ideas and explain his solution of the Königsberg bridge problem. 


8.2 Fundamental Ideas of Graph Theory 


A graph consists of vertices and edges. A graph may be connected (that is, consist of 
one contiguous piece) or disconnected (that is, consist of more than one contiguous 
piece). See Fig. 8.4. Notice in the figure that the edges of the graph determine certain 
two-dimensional regions, or faces, in the graph. The graph on the left defines 3 faces, 
and the graph on the right defines 2 faces. It is customary in this subject to think of 
the graph as living on a sphere rather than in the plane, so that the exterior region 
(see Fig. 8.5) counts as a face. 
Euler’s first fundamental insight about graphs is the following theorem: 


Theorem 8.1 Let G be any connected graph on the sphere. Let V be the number 
of vertices, E the number of edges, and F the number of two-dimensional regions 
(or faces) defined by the edges. Then 


V − E + F = 2 


We should like to spend some time explaining why this important theorem is 
true. Begin with the simplest possible graph (see Fig. 8.6). It has just one vertex. 
There are no edges. And there is one face—which is the entire region of the sphere 




a connected graph a disconnected graph 


Figure 8.4 Connected and disconnected graphs. 
Figure 8.5 A graph on the sphere. 


complementary to the single vertex. Thus V = 1, E = 0, and F = 1. Thus we see 
that 


V − E + F = 1 − 0 + 1 = 2 


So Euler’s theorem is true in this very simple case. 

Now imagine making the graph more complex. We add a single edge, as shown 
in Fig. 8.7. How have the numbers changed? Well, now E = 1. But there is an 
additional vertex (that is, there is a vertex at each end of the edge), so V = 2. And 
there is still a single face, so F = 1. Now 


V − E + F = 2 − 1 + 1 = 2 


Thus Euler’s theorem remains true. 

Now the fundamental insight is that we can build up any graph by adding one 
edge at a time. And there are only three ways that we may add an edge. Let us 
discuss them one at a time: 


• We can add a new edge so that it has one vertex on the existing graph and 
one vertex free (see Fig. 8.8). In doing so, we add one vertex, add one edge, 
and do not change the number of faces. Thus, in the formula V − E + F, we 


Figure 8.6 Beginning of the proof of Euler’s formula. 
Figure 8.7 Euler’s formula for a more complicated graph. 


have increased E by 1 and increased V by 1. These two increments cancel 
out, so the sum of 2 remains unchanged. 


• We can add the new edge so that both ends (the vertices) are at the same point 
on the existing graph (see Fig. 8.9). Thus we have added one edge, no vertices, 
and one face. As a result, in the formula V − E + F, we have increased E 
by 1 and increased F by 1. These two increments cancel out, so the sum of 2 
remains unchanged. 


• We can add the new edge so that the two ends are at two different vertices of 
the existing graph (see Fig. 8.10). So we have added one edge and one face, but 
no vertices. As a consequence, in the formula V − E + F, we have increased 
E by 1 and increased F by 1. But there are no new vertices. Therefore the 
two increments cancel out, and the sum of 2 remains unchanged. 


This exhausts all the cases, and shows that, as we build up any graph, the Euler sum 
V — E + F will always be 2. 
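The bookkeeping in this argument can be replayed by machine. The sketch below is only an illustration of the counting, under our own naming (`build_random_graph`, and the move labels `pendant`, `loop`, `chord` for the three cases): it does not draw any graph, it simply tracks the triple (V, E, F) while edges are added at random in the three permitted ways, and it checks that V − E + F stays equal to 2.

```python
import random

def build_random_graph(steps, seed=0):
    """Grow a connected graph on the sphere one edge at a time.

    Only the counts V, E, F are tracked, not the drawing itself.
    Returns the list of (V, E, F) triples after each step.
    """
    rng = random.Random(seed)
    v, e, f = 1, 0, 1          # a single vertex: the base case
    history = [(v, e, f)]
    for _ in range(steps):
        moves = ["pendant", "loop"]
        if v >= 2:
            moves.append("chord")   # needs two different existing vertices
        move = rng.choice(moves)
        if move == "pendant":       # one end on the graph, one end free
            v, e = v + 1, e + 1
        else:                       # loop or chord: closes off a new face
            e, f = e + 1, f + 1
        history.append((v, e, f))
    return history

history = build_random_graph(steps=50)
assert all(v - e + f == 2 for v, e, f in history)
print(history[-1])
```

Each move changes exactly two of the three counts by one, with opposite signs in the formula, which is why the invariant survives every step.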

We call 2 the Euler characteristic of the sphere. It is a fundamental geometric 
invariant of this surface. It turns out that the Euler characteristic of a torus—see 
Fig. 8.11—is not 2. It is in fact 0, as the figure indicates. 




Figure 8.8 The inductive step in the proof of Euler’s formula. 
Figure 8.9 The inductive step in the proof of Euler’s formula. 
Figure 8.10 The inductive step in the proof of Euler’s formula. 


Figure 8.11 The Euler characteristic of a torus. 
8.3 Application to the Königsberg Bridge Problem 


Before returning to Euler’s original problem, let us look at an even more fundamental 
question. Let {v1, v2, ..., vk} be a collection of vertices in the plane (or on 
the sphere). The complete graph on these vertices is the graph that has an edge 
connecting any two of the vertices. As an instance, the complete graph on three 
vertices is shown in Fig. 8.12. The complete graph on four vertices is shown in 
Fig. 8.13. We may ask whether the complete graph on five vertices can be drawn 
in the plane—or on the sphere—without any edges crossing any others. If that 
were possible, then the resulting graph would have five vertices (so V = 5) and 
$\binom{5}{2} = 10$ edges (so E = 10) and $\binom{5}{3} = 10$ faces (because every face would have 
to be a triangle). But then V − E + F = 5 − 10 + 10 = 5, and that is impossible. 
The answer is supposed to be 2! We say that the complete graph on five vertices 
cannot be imbedded in the plane (or in the sphere). This simple example already 
illustrates the power of Euler’s formula. 
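For readers who want the face count pinned down more carefully, here is a supplementary derivation. It is a standard consequence of Euler's formula rather than the argument given in the text, and it assumes a connected graph with at least three vertices, no loops, and no repeated edges, drawn without crossings. Every face is then bounded by at least three edges, and every edge borders at most two faces.

```latex
\begin{align*}
3F &\le 2E
  && \text{(each face has at least 3 edges; each edge borders at most 2 faces)} \\
F &= 2 - V + E
  && \text{(Euler's formula)} \\
3(2 - V + E) &\le 2E
  \;\Longrightarrow\; E \le 3V - 6 .
\end{align*}
```

With V = 5 this gives E ≤ 9, while the complete graph on five vertices has 10 edges. So no crossing-free drawing can exist, in agreement with the conclusion above.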

Now let us examine the seven bridges of Königsberg. In Fig. 8.14 we convert the 
original Königsberg bridge configuration into a planar graph. We do so by planting 
a flag in each land mass defined by the river (there are four land masses in Fig. 8.3) 
and connecting two flags if there is a bridge between the two corresponding land 
masses. If the order of a vertex in a graph is the number of edges that meet at that 
vertex, then we see that the graph in Fig. 8.14 has three vertices of order three. This 
fact has the following significance. 

Imagine a traveler endeavoring to satisfy the stipulations of the Königsberg 
bridge problem. If this traveler enters a vertex of order three, then that traveler will 
leave that vertex on a different edge (since the traveler is not allowed to traverse 
the same edge twice in this journey). But then, if the traveler ever enters that vertex 
again, he/she cannot leave. There is no edge left on which the traveler can leave 
(since there are only three edges total at that vertex). So the journey would have 


Figure 8.12 The complete graph on three vertices. 
Figure 8.13 The complete graph on four vertices. 


to end at that vertex. That is OK, but there are three vertices of order three. The 
journey cannot end at all three of those vertices—just at one of them. This is a 
contradiction. 

We see therefore, by Euler’s original analysis, that it is impossible to find a 
journey that traverses all seven bridges while not repeating any part of the path. 


You Try It: Remove one of the seven bridges from the Pregel River. How does 
this affect the Königsberg bridge problem? Is it now possible to chart a path, never 
repeating any part of the route and crossing each bridge precisely once? Does it 
matter which one of the seven bridges you remove? 


Let us say that a graph has an Euler path if there is a path in the graph that traces 
each edge once and only once. We have shown that the graph corresponding to the 
original seven Königsberg bridges does not have an Euler path. 
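Counting the order of each vertex is a purely mechanical task. The sketch below encodes the classical Königsberg layout as a list of bridges; the land-mass labels A, B, C, D and the names `BRIDGES` and `vertex_orders` are our own and do not come from the figure.

```python
from collections import Counter

# Hypothetical labels: A is the central island, B and C are the two river
# banks, D is the eastern land mass. Each pair below is one bridge.
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]

def vertex_orders(edges):
    """Count how many edge ends meet at each vertex."""
    orders = Counter()
    for u, w in edges:
        orders[u] += 1
        orders[w] += 1
    return orders

orders = vertex_orders(BRIDGES)
odd_vertices = sorted(v for v, k in orders.items() if k % 2 == 1)
print(dict(orders))
print(odd_vertices)
```

Three of the land masses have order three, as noted above, and the fourth has order five. The same function can be applied to any list of edges, for example after deleting one bridge from `BRIDGES`.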


Figure 8.14 The graph corresponding to the Königsberg bridge configuration. 
Figure 8.15 Two graphs for Euler analysis. 


In a given graph, call a vertex odd if an odd number of edges meet at that vertex. 
Call the vertex even if an even number of edges meet at that vertex. 


You Try It: Explain why, if a graph has more than two odd vertices, then it does 
not have an Euler path. 


You Try It: Examine each of the graphs in Fig. 8.15. Which of these has an Euler 
path? If it does, then find this path. 


You Try It: Examine each of the graphs in Fig. 8.16. Which of these has an Euler 
path? If it does, then find this path. 


You Try It: Examine each of the graphs in Fig. 8.17. Which of these has an Euler 
path? If it does, then find this path. 




Figure 8.16 Two more graphs for Euler analysis. 




Figure 8.17 Yet two more graphs for Euler analysis. 


8.4 Coloring Problems 


Many mathematical problems originate among professional mathematicians at 
universities. After all, they are the folks who spend all day every day thinking about 
mathematics. They are well qualified to identify and develop interesting directions 
to investigate. But it also happens that some fascinating and long-standing mathe- 
matics problems will originate with laymen. The celebrated four-color problem is 
an example of such. 

In 1852 Francis W. Guthrie, a graduate of University College London, posed the 
following question to his brother Frederick: 


Imagine a geographic map on the earth (that is, a sphere) consisting of coun- 
tries only—no oceans, lakes, rivers, or other bodies of water. The only rule is 
that a country must be a single contiguous mass—in one piece, and with no 
holes—see Fig. 8.18. As cartographers, we wish to color the map so that no 
two adjacent countries will be of the same color (Fig. 8.19—note that R, G, 
B, and Y stand for red, green, blue, and yellow). How many colors should 
the map-maker keep in stock so that he can be sure he can color any map? 


Frederick Guthrie was a student of Augustus De Morgan (1806-1871), and 
ultimately communicated the problem to his mentor. The problem was passed 
around among academic mathematicians for a number of years [in fact De Morgan 
communicated the problem to William Rowan Hamilton (1805—1865)]. The first 
allusion in print to the problem was by Arthur Cayley (1821—1895) in 1878. 

The eminent mathematician Felix Klein (1849-1925) in Göttingen heard of the 
problem and declared that the only reason the problem had never been solved is 
that no capable mathematician had ever worked on it. He, Felix Klein, would offer 
a class, the culmination of which would be a solution of the problem. He failed. 
Figure 8.18 Map coloring. (Region labels in the figure: two regions marked “Not a country” and one marked “This is a country.”) 


In 1879, A. Kempe (1845-1922) published a solution of the four-color problem. 
That is to say, he showed that any map whatever could be colored with four colors. 
Kempe’s proof stood for 11 years. Then a mistake was discovered by P. Heawood 
(1861-1955). Heawood studied the problem further and came to a number of 
fascinating conclusions: 


Figure 8.19 The four-color problem. 
• Kempe’s proof, particularly his device of “Kempe chains,” does suffice to 
show that any map whatever can be colored with five colors. 


• Heawood showed that if the number of edges around each region in the map 
is divisible by three, then the map is four-colorable. 


• Heawood found a formula that gives an estimate for the “chromatic number” 
of any surface. Here the chromatic number χ(g) of a surface of genus g is the least 
number of colors it will take to color any map on that surface. In fact the formula is 


\[ \chi(g) \le \left\lfloor \tfrac{1}{2}\left(7 + \sqrt{48g + 1}\right) \right\rfloor \]


so long as g ≥ 1. 

Here is how to read this formula. It is known, thanks to work of Camille 
Jordan (1838-1922) and August Möbius (1790-1868), that any surface in 
space is a sphere with handles attached (see Fig. 8.20). The number of handles 
is called the genus, and we denote it by g. The Greek letter chi (χ) denotes the 
chromatic number of the surface, that is, the least number of colors that it will take 
to color any map on the surface. Thus χ(g) is the number of colors that it will 
take to color any map on a surface that consists of the sphere with g handles. 
Next, the symbols ⌊ ⌋ stand for the “greatest integer function.” For example, 
⌊9/2⌋ = 4 just because the greatest integer in the number “four and a half” is 4. 
Also ⌊π⌋ = 3 because π = 3.14159 . . . and the greatest integer in the number 
pi is 3. 





Figure 8.20 The structure of a closed surface in space. 
Figure 8.21 The torus is a sphere with one handle. 


Now a sphere is a sphere with no handles, so g = 0. We may calculate that 
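
\[ \chi(g) \le \left\lfloor \tfrac{1}{2}\left(7 + \sqrt{48 \cdot 0 + 1}\right) \right\rfloor = \left\lfloor \tfrac{1}{2} \cdot 8 \right\rfloor = 4 \]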


This is the four-color theorem! Unfortunately, Heawood’s proof was only 
valid when the genus is at least 1. It gives no information about the sphere. 
The torus (see Fig. 8.21) is topologically equivalent to a sphere with one 
handle. Thus the torus has genus g = 1. Then Heawood’s formula gives the 
estimate 7 for the chromatic number. And in fact we can give an example of 
a map on the torus that requires seven colors. Here is what Fig. 8.22 shows. 
It is convenient to take a pair of scissors and cut the torus apart. With one 
cut, the torus becomes a cylinder; with the second cut it becomes a rectangle. 







Figure 8.22 The torus as a rectangle with identifications. 
Figure 8.23 A map on the torus that requires seven colors. 


The arrows on the edges indicate that the left and right edges are to be 
identified (with the same orientation), and the upper and lower edges are to 
be identified (with the same orientation). We call our colors “1,” “2,” “3,” “4,” 
“5,” “6,” “7.” The reader may verify that there are seven countries shown in our 
Fig. 8.23, and every country is adjacent to (that is, touches) every other. Thus 
they all must have different colors! This is a map on the torus that requires 
seven colors; it shows that Heawood’s estimate is sharp for this surface. 


Heawood was unable to decide whether the chromatic number of the sphere 
is 4 or 5. He was also unable to determine whether any of his estimates for the 
chromatic numbers of various surfaces of genus g ≥ 1 were sharp or accurate. That 
is to say, for the torus (the closed surface of genus 1), Heawood’s formula says 
that the chromatic number does not exceed 7. Is that in fact the best number? Is 
there a map on the torus that really requires seven colors? And for the sphere with 
two handles (genus 2), Heawood’s formula gives an estimate of 8. Is that the best 
number? Is there a map on the double torus that actually requires eight colors? And 
so forth: we can ask the same question for every surface of every genus. Heawood 
could not answer these questions. 
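Heawood's estimate, the greatest integer in (7 + √(48g + 1))/2, is easy to tabulate for small genus. The following sketch uses a function name of our own choosing, `heawood_bound`, and exact integer arithmetic: since 7 is an integer and we divide by 2, the integer square root gives the same greatest-integer value as the real square root.

```python
from math import isqrt

def heawood_bound(genus):
    """Return floor((7 + sqrt(48*g + 1)) / 2) using exact integer arithmetic."""
    if genus < 0:
        raise ValueError("genus must be nonnegative")
    return (7 + isqrt(48 * genus + 1)) // 2

for g in range(4):
    print(g, heawood_bound(g))
```

For g = 1 and g = 2 this reproduces the values 7 and 8 discussed above. For g = 0 it returns 4, but recall that Heawood's argument does not cover the sphere, so that value is only suggestive.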


8.4.1 MODERN DEVELOPMENTS 


The late nineteenth century saw more alleged solutions of the four-color problem, 
many of which stood for as long as 11 years. Eventually errors were found, and the 
problem remained open on into the twentieth century. 

What is particularly striking is that Gerhard Ringel (1919-2008) and J. W. T. 
Youngs (1910-1970) were able to prove in 1968 that all of Heawood’s estimates, for 
the chromatic number of any surface of genus at least 1, are sharp. So the chromatic 
number of a torus is indeed 7. The chromatic number of a “double-torus” with 
two holes is 8. And so forth. But the Ringel/Youngs proof, just like the Heawood 
formula, does not apply to the sphere. They could not improve on Heawood’s result 
that five colors will always suffice. The four-color problem remained unsolved. 

Then in 1974 there was blockbuster news. Using 1200 hours of computer time 
on the University of Illinois supercomputer, Kenneth Appel and Wolfgang Haken 
showed that in fact four colors will always work to color any map on the sphere. 
Their technique is to identify 633 fundamental configurations of maps (to which 
all others can be reduced) and to prove that each of them is reducible to a simpler 
configuration. But the number of “fundamental configurations” was very large, and 
the number of reductions required was beyond the ability of any human to count. 
And the reasoning is extremely intricate and complicated. Enter the computer. 

In those days computing time was expensive and not readily available, and Appel 
and Haken certainly could not get a 1200-hour contiguous time slice for their work. 
So the calculations were done late at night, “off the record,” during various down 
times. In fact, Appel and Haken did not know for certain whether the calculation 
would ever cease. Their point of view was this: 


1. If the computer finally stopped, then it would have checked all the cases and the 
four-color problem would be solved. 


2. If the computer never stopped then they could draw no conclusion. 


Well, the computer stopped. But the level of discussion and gossip and disagree- 
ment in the mathematical community did not. Was this really a proof? The computer 
had performed tens of millions of calculations. Nobody could ever check them all. 

But now the plot thickens. Because in 1975 a mistake was found in the proof. 
Specifically, there was something amiss with the algorithm that Appel and Haken 
fed into the computer. It was later repaired. The paper was published in 1976. The 
four-color problem was declared to be solved. 

In a 1986 article, Appel and Haken point out that the reader of their seminal 
1976 article must face 


1. 50 pages containing text and diagrams; 
2. 85 pages filled with almost 2500 additional diagrams; 
3. 400 microfiche pages that contain further diagrams and thousands of individual 
verifications of claims made in the 24 statements in the main section 
of the text. 


But it seems as though there is always trouble in paradise. Errors continued to 
be discovered in the Appel/Haken proof. Invariably the errors were fixed. But the 
stream of errors never seemed to cease. So is the Appel/Haken work really a proof? 
Well, there is hardly anything more reassuring than another, independent proof. 
Paul Seymour and his group at Princeton University found another way to attack 
the problem. In fact they found a new algorithm that seems to be more stable. They 
also needed to rely on computer assistance. But by the time they did their work 
computers were much, much faster. So they required much less computer time. In 
any event, this paper appeared in 1994. 


8.4.2 DENOUEMENT 


It is still the case that mathematicians are most familiar with, and most comfortable 
with, a traditional, self-contained proof that consists of a sequence of logical steps 
recorded on a piece of paper. We still hope that some day there will be such a 
proof of the four-color theorem. After all, it is only a traditional, Euclidean-style 
proof that offers the understanding, the insight, and the sense of completion that all 
scholars seek. 

And there are new societal needs: theoretical computer science and engineering 
and even modern applied mathematics require certain pieces of information and 
certain techniques. The need for a workable device often far exceeds the need to be 
certain that the technique can stand up to the rigorous rules of logic. The result may 
be that we shall reevaluate the foundations of our subject. The way that mathematics 
is practiced in the year 2100 may be quite different from the way that it is practiced 
today. 


8.5 The Traveling Salesman Problem 


It is a charming fact of life that some of the most fascinating mathematical problems 
have utterly simple statements that can be understood by most anyone. Fermat’s 
last theorem is such a problem. The four-color problem (see Sec. 8.4) is another. A 
problem that is fairly old, and is still of preeminent importance both for mathematics 
and logic and also for theoretical computer science, is the celebrated “traveling 
salesman problem” (TSP). We shall discuss this problem in the present section. 
First studied in the mid-nineteenth century by William Rowan Hamilton (1805-1865) 
and Thomas Kirkman (1806-1895), the question concerns traveling a circuit 
in the most efficient fashion. The question is often formulated in terms of a traveling 
salesman who must visit cities C1, C2, ..., Ck. There is a path connecting any city 
to any other, and a cost assigned to each path. The goal of the salesman is to begin 
at some city (say C1) and to visit every city precisely once. The trip is to end 
again at C1. And obviously the salesman wants to minimize his cost. See Fig. 8.24. 
We may use our knowledge of counting techniques to quickly get an estimate 
of the number of possible paths that the salesman can take. For the first leg of the 
Figure 8.24 The traveling salesman problem. 


trip, the salesman (beginning at C1) may choose to go to C2 or C3 or ... Ck. Thus 
there are (k — 1) choices. For the next leg, the salesman may choose any of the 
remaining (k — 2) cities. So there are (k — 2) choices. And so forth. In summary, 
there are 


(k − 1) · (k − 2) ··· 3 · 2 · 1 = (k − 1)! 


possible paths that the traveling salesman might take. According to a formula of 
Stirling, 


\[ (k-1)! \approx \sqrt{2\pi(k-1)} \cdot \frac{(k-1)^{k-1}}{e^{k-1}} \qquad (8.1) \]


In particular, the number grows at least exponentially in k. This means that checking 
every possible path is difficult, and takes a great many steps. 
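To get a feeling for how fast the count (k − 1)! grows, and for how well Stirling's formula tracks it, one can simply compute both. This is a small numerical sketch; the function names `tour_count` and `stirling` are ours, and `stirling(n)` implements the usual approximation √(2πn)·(n/e)^n for n!.

```python
from math import e, factorial, pi, sqrt

def tour_count(k):
    """Number of orderings of the remaining k - 1 cities: (k - 1)!."""
    return factorial(k - 1)

def stirling(n):
    """Stirling's approximation sqrt(2*pi*n) * (n / e)**n for n!."""
    return sqrt(2 * pi * n) * (n / e) ** n

for k in (5, 10, 15, 20):
    exact = tour_count(k)
    approx = stirling(k - 1)
    print(k, exact, f"{approx / exact:.4f}")
```

The last column is the ratio of the approximation to the exact value. It sits slightly below 1 and creeps toward 1 as k grows, while the exact count quickly becomes far too large to enumerate.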

A variety of techniques are known for estimating the correct solution to the 
traveling salesman problem. In particular, if one is willing to settle for a path that 
costs no more than twice the optimal path, then one may find a solution rather 
efficiently. But the truly optimal solution can be quite complex to find. As an 
instance, in 2001 the optimal tour of 15,112 German cities was found—and it was 
shown that the given solution was indeed optimal. The calculation was performed 
on a 110-CPU parallel-processing computer and required the equivalent of 22.6 years 
of computer time on a single 500 MHz Alpha processor. In 2006 the optimal circuit for 
85,900 cities was found. It took over 136 computing years using the Concorde TSP solver. 
A current and timely instance of the traveling salesman problem occurs in the 
manufacture of electronic circuits. These days such circuits are solid state, and all 
the components are mounted on a board. There are often thousands of units, and 
much of the manufacture is automated. Certainly part of the process is that a drill 
head must travel around the circuit board making holes for the various electronic 
components. Since many tens of thousands of circuit boards will be manufactured, 
one wants the tour of the drill head to be as efficient as possible. 

It has been argued that the single most important problem today in the mathe- 
matical sciences is the P vs. NP problem. Roughly speaking, this is a problem of 
considering when a problem can be solved in polynomial time and when it will 
take exponential time. As a simple instance, the problem of taking N randomly 
shuffled cards and putting them in order is of polynomial time because one first 
goes through all N cards to find the first card, then one looks through the remaining 
(N — 1) cards to find the second card, and so forth. In short, it takes at most 


\[ N + (N-1) + (N-2) + \cdots + 3 + 2 + 1 = \frac{N(N+1)}{2} \le N^2 \]

steps to sort the cards. This is a polynomial estimate on the number of steps. By 
contrast, the traveling salesman problem with N cities takes about (N − 1)! ≈ 
(N/e)^(N−1) steps, and hence is of exponential complexity. It is known that a complete 
solution to the traveling salesman problem is logically equivalent to a solution of 
the P/NP problem. This is one of the Clay Mathematics Institute’s 1 million dollar 
Millennium Prize Problems! 
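The card-sorting count can be checked directly. The sketch below follows the procedure just described (scan the whole remaining pile, pull out the next card, repeat) and counts how many cards are examined in total. The function name `sort_cards` and the sample deck are our own.

```python
def sort_cards(cards):
    """Sort by repeatedly scanning the unsorted pile for its smallest card.

    Returns the sorted list and the total number of cards examined.
    """
    pile = list(cards)
    ordered = []
    examined = 0
    while pile:
        examined += len(pile)          # look at every remaining card
        smallest = min(pile)
        pile.remove(smallest)
        ordered.append(smallest)
    return ordered, examined

deck = [7, 2, 9, 4, 1, 8, 3]
ordered, examined = sort_cards(deck)
n = len(deck)
print(ordered, examined, n * (n + 1) // 2, n * n)
```

For a deck of N cards the count is always exactly N(N + 1)/2, whatever the initial order, and that is at most N². Compare this with the factorial growth of the number of salesman circuits.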

We conclude with a few words about how one might find an efficient (though 
not necessarily optimal) circuit for the traveling salesman problem. There is a 
commonly used algorithm, both in tree theory and in graph theory, for finding 
optimal paths and circuits. It is called the greedy algorithm. The idea is simple and 
intuitive. Suppose we are given a layout of cities and paths connecting them, with 
a cost connected with each path—see Fig. 8.24. Notice in the figure that every city 
(node) is connected to every other city. We begin our circuit at the node C with the 
cheapest path between two cities. In the figure that is the path from C to D that costs 
$55. So, for our first step, we pass from C to D. The next arc must begin at D. We 
choose the cheapest remaining arc emanating from D (but of course not the one we 
have already traversed). In the figure, that takes us to F over the arc that costs $45. 
And we continue in this fashion, at every step choosing the cheapest arc possible. 
This is the greedy algorithm. 

There is no guarantee that the greedy algorithm produces the truly optimal circuit 
for the traveling salesman, and in general it does not come with a guarantee of the 
form “no more than twice the optimal cost.” (Guarantees of that kind are known for 
other methods when the costs satisfy the triangle inequality.) Still, for many 
applications the greedy circuit is adequate. 
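Here is a small sketch of the greedy rule described above, often called the nearest-neighbor rule, set beside a brute-force search for comparison. The four cities and their costs are hypothetical and are not the data of Fig. 8.24; the names `COST`, `greedy_tour`, and `optimal_tour` are ours.

```python
from itertools import permutations

# Hypothetical symmetric costs between four cities; not the data of Fig. 8.24.
COST = {
    ("A", "B"): 20, ("A", "C"): 42, ("A", "D"): 35,
    ("B", "C"): 30, ("B", "D"): 34, ("C", "D"): 12,
}
CITIES = ["A", "B", "C", "D"]

def cost(u, w):
    return COST[(u, w)] if (u, w) in COST else COST[(w, u)]

def tour_cost(tour):
    """Cost of visiting the cities in order and returning to the start."""
    return sum(cost(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))

def greedy_tour(cities):
    """Start on the cheapest edge, then always take the cheapest arc to an unvisited city."""
    start, second = min(COST, key=COST.get)
    tour = [start, second]
    unvisited = set(cities) - set(tour)
    while unvisited:
        nxt = min(unvisited, key=lambda c: cost(tour[-1], c))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def optimal_tour(cities):
    """Brute force over all (k - 1)! circuits; only feasible for tiny k."""
    first, rest = cities[0], cities[1:]
    return min(([first, *p] for p in permutations(rest)), key=tour_cost)

greedy = greedy_tour(CITIES)
best = optimal_tour(CITIES)
print(greedy, tour_cost(greedy))
print(best, tour_cost(best))
```

With these made-up costs the greedy circuit costs 108 while the best circuit costs 97. The greedy rule grabs the cheap arc from C to D first and is later forced onto the expensive arc from A back to C.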
Exercises 


1. Give an example of a graph on five vertices without an Euler path. 
2. Give an example of a graph on five vertices with two distinct Euler paths. 


3. Imagine a torus with two handles. What would be the correct Euler formula 
for this surface? It should have the form 


V — E + F = (some number) 


What is that number? The number χ on the right hand side is called the Euler 
characteristic of the surface. 

4. Consider the complete graph on six vertices. How many edges does it have? 
How many faces? 


5. How many edges does the complete graph on k vertices have? 


6. Consider a graph built on two rows of three vertices for a total of six vertices. 
Construct a graph by connecting every vertex in the first row to every vertex 
of the second row and vice versa. How many edges does this graph have? 


7. Consider the standard picture of a five-pointed star. This can be thought of 
as a graph. How many vertices does it have? How many edges? 


8. Give an example of a graph with more vertices than edges. Give an example 
of a graph with more edges than vertices. 


9. If a surface can be described as a “sphere with g handles,” then we say it has 
genus g. Thus a lone sphere has genus 0, a torus has genus 1, and so forth. 
Based on your experience with Exercise 3 above, posit a formula that relates 
the Euler characteristic χ with the genus g. 
CHAPTER 9 





Number Theory 


9.1 Divisibility 
Let N denote the collection of all positive integers. We call these the natural num- 
bers. Number theorists study the properties of N. 
Let n and k be two positive integers. We say that k divides n if there is another 
positive integer m such that n = m · k. For example, let n = 12 and k = 3. Certainly 
3 divides 12 because 


12 = 4 · 3 


On the other hand, 5 does not divide 13 because there is no positive integer m with 


13 = m · 5 


In fact if we divide 5 into 13 we find that it goes twice with a remainder of 3. We 
may write 


13 = 2 · 5 + 3 


Whenever the remainder is nonzero then the division is not even. 




Turning this reasoning around, we may enunciate the Euclidean algorithm. Let 
n be a positive integer and d another positive integer. Then there are two other 
integers (the quotient q ≥ 0 and the remainder r, with 0 ≤ r < d) such that 


n = q · d + r (9.1) 


This simple formula, which dates back more than 2000 years and is commonly 
called the Euclidean algorithm, is the basis for our understanding of division. It 
says that we may divide any integer n by any other integer d. We get an answer, 
or quotient, q and a remainder r. When the remainder r is equal to 0 then 
Eq. (9.1) becomes 


n = q · d 


and we see that d divides n. 
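Eq. (9.1) is exactly what Python's built-in `divmod` computes for positive integers. The following sketch wraps it in a function of our own, `divide`, and runs it on the examples of this section.

```python
def divide(n, d):
    """Return (q, r) with n == q*d + r and 0 <= r < d."""
    if d <= 0 or n < 0:
        raise ValueError("expected n >= 0 and d > 0")
    q, r = divmod(n, d)
    return q, r

for n, d in [(12, 3), (13, 5), (84, 18)]:
    q, r = divide(n, d)
    print(f"{n} = {q} * {d} + {r}", "(d divides n)" if r == 0 else "")
```

The remainder is zero precisely when d divides n, as with 12 and 3. For 13 and 5 we recover the quotient 2 and remainder 3 found above.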

The Euclidean algorithm has many uses. For example, it is a matter of some 
interest to find the greatest common divisor of two given positive integers. This can 
be done by educated guessing. For example, the greatest common divisor of 84 and 
18 is 6. But the Euclidean algorithm gives a step-by-step method (that one could 
program onto a computer) for finding this number. 

How does the method work? We divide 18 into 84 with remainder 12:

$$84 = 4 \cdot 18 + 12 \tag{9.2}$$

Now we divide the remainder 12 into the divisor 18:

$$18 = 1 \cdot 12 + 6 \tag{9.3}$$

Next we repeat the algorithm by dividing the remainder 6 into the divisor 12:

$$12 = 2 \cdot 6 \tag{9.4}$$


We see that there is no remainder, so the process must stop. The greatest common 
divisor is 6 (which is the last divisor that has occurred). 

We can also see, by examining this last example, why the method of the Euclidean
algorithm must work to find the greatest common divisor. For Eq. (9.4) shows that
6 must divide 12. Then Eq. (9.3) shows that 6 must divide 18. And, finally, Eq. (9.2)
shows that 6 must divide 84. So certainly we see that 6 is a common divisor of 18
and 84. Reversing the reasoning, any common divisor of 84 and 18 must divide
$12 = 84 - 4 \cdot 18$ by Eq. (9.2), and hence must divide $6 = 18 - 1 \cdot 12$ by
Eq. (9.3). So 6 is the greatest number that could have this property.
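
The step-by-step method lends itself to a short program. The following Python sketch
is one possible rendering, not a definitive implementation; the function name
`euclid_gcd` and the layout of the recorded steps are our own choices for illustration.
It repeats the division n = q * d + r until the remainder is 0 and returns the last
nonzero divisor.

```python
def euclid_gcd(n, d):
    """Return the greatest common divisor of n and d, with the division steps.

    Each recorded step (n, q, d, r) satisfies n = q * d + r with 0 <= r < d.
    """
    steps = []
    while d != 0:
        q, r = divmod(n, d)      # quotient and remainder of n divided by d
        steps.append((n, q, d, r))
        n, d = d, r              # the old divisor is divided by the remainder next
    return n, steps


g, steps = euclid_gcd(84, 18)
for n, q, d, r in steps:
    print(f"{n} = {q} * {d} + {r}")
print("greatest common divisor:", g)
```

Run on 84 and 18, the loop performs exactly the three divisions of Eqs. (9.2), (9.3),
and (9.4) and reports 6.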

We say that two positive integers m and n are relatively prime if they have no
common divisors (except of course 1). If m and n do have a common divisor greater
than 1, say k, then any prime factor of k will also be a common divisor of m and n.
So another way to express the concept is that m and n are relatively prime if they
have no common prime divisors.

This suggests an ad hoc way of telling when two given integers are relatively 
prime. Just write out their prime factorizations and see whether there are any com- 
mon factors. As an example, 


$$90 = 2 \cdot 3^2 \cdot 5$$

and

$$77 = 7 \cdot 11$$

We see explicitly that the numbers 90 and 77 have no common prime factors. So
they are relatively prime.

By contrast, consider

$$360 = 2^3 \cdot 3^2 \cdot 5$$

and

$$108 = 2^2 \cdot 3^3$$

We see that these two integers have two factors of 2 in common and two factors
of 3 in common. Altogether then, the greatest common divisor of 360 and 108 is
$4 \cdot 9 = 2^2 \cdot 3^2 = 36$.

We also know from the discussion above that the Euclidean algorithm may be 
used to determine the greatest common divisor of two given integers. If that greatest 
common divisor is 1, then the two integers are relatively prime. 
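
The ad hoc method of comparing prime factorizations can also be sketched in Python.
The helper names `prime_factors` and `gcd_from_factors` are hypothetical, and trial
division is used only because it is the simplest approach for small numbers; it is
not an efficient factoring method.

```python
from collections import Counter


def prime_factors(n):
    """Return the prime factorization of n >= 2 as a Counter {prime: exponent}."""
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:        # divide out p as many times as possible
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:                    # whatever is left over is itself prime
        factors[n] += 1
    return factors


def gcd_from_factors(m, n):
    """Multiply together the prime powers that m and n have in common."""
    common = prime_factors(m) & prime_factors(n)   # minimum of the exponents
    g = 1
    for p, e in common.items():
        g *= p ** e
    return g


print(dict(prime_factors(360)), dict(prime_factors(108)))
print(gcd_from_factors(360, 108))   # shared factors 2^2 and 3^2
print(gcd_from_factors(90, 77))     # no shared primes, so relatively prime
```

Two integers are relatively prime exactly when `gcd_from_factors` returns 1, which
agrees with the answer the Euclidean algorithm gives.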


9.2 Primes 


The building blocks of the natural numbers are the prime numbers. A positive
integer is called prime if it is not divisible by any integer except for 1 and itself. The
first several primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, .... It is customary not
to refer to 1 as a prime. The first prime number is 2, and 2 is the only even prime
number. It is an old theorem of Euclid that there are in fact infinitely many primes.

The fundamental theorem of arithmetic says that every natural number greater
than 1 can be written as a product of primes (or powers of primes) in a unique way,
apart from the order of the factors. For example,

$$2520 = 2^3 \cdot 3^2 \cdot 5 \cdot 7$$

And there is no other way to factor 2520 into primes. Notice that the factors 2 and
3 are repeated: the factor 2 occurs three times and the factor 3 occurs twice. The
factors 5 and 7 occur once only.


9.3 Modular Arithmetic 


One of the most useful devices for studying the integers is modular arithmetic.
We say that two integers m and n are equivalent modulo k if the number $m - n$ is
divisible by k. This concept is best illustrated with some examples.


EXAMPLE 9.1 

Let k = 2. The numbers 3 and 5 are equivalent modulo 2 just because $5 - 3 = 2$
is divisible by 2. In the same way, 5 and 11 are equivalent modulo 2 because
$11 - 5 = 6$ is divisible by 2. We write

$$5 = 3 \bmod 2$$

and

$$11 = 5 \bmod 2$$

In fact it turns out that any two odd integers are equivalent modulo 2 because the
difference of two odd integers will be an even number.

We also note that 4 and 12 are equivalent modulo 2 because $12 - 4 = 8$ is
divisible by 2. In the same way, 24 and 48 are equivalent modulo 2 because
$48 - 24 = 24$ is divisible by 2. In point of fact, any two even numbers are equivalent
modulo 2.














EXAMPLE 9.2 
Let k = 12. The numbers 4 and 18 are not equivalent modulo 12 because
$18 - 4 = 14$ is not divisible by 12. However, the numbers 13 and 37 are equivalent
modulo 12 because $37 - 13 = 24$ is divisible by 12.

The numbers 125 and 185 are equivalent modulo 12 because $185 - 125 = 60$ is
divisible by 12. The numbers 5 and 132 are not equivalent modulo 12 because
$132 - 5 = 127$ is not divisible by 12.
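
The definition translates directly into a one-line test. In this small Python sketch
the function name `equivalent_mod` is our own; it simply asks whether k divides m - n
and replays the pairs from Examples 9.1 and 9.2.

```python
def equivalent_mod(m, n, k):
    """True when m and n are equivalent modulo k, that is, when k divides m - n."""
    return (m - n) % k == 0


pairs = [(3, 5, 2), (4, 18, 12), (13, 37, 12), (125, 185, 12), (5, 132, 12)]
for m, n, k in pairs:
    print(f"{m} and {n} modulo {k}: {equivalent_mod(m, n, k)}")
```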














Modularity respects the arithmetic operations (in the formulas that follow, each
equality is understood modulo k). For example, if

$$n = m + \ell$$

then

$$n \bmod k = m \bmod k + \ell \bmod k$$

Also, if

$$n = m \cdot \ell$$

then

$$n \bmod k = (m \bmod k) \cdot (\ell \bmod k)$$

As an application of these last ideas, we can see quickly that 1347 does not divide
25168. For if

$$25168 = m \cdot 1347$$

then

$$25168 \bmod 3 = (m \bmod 3) \cdot (1347 \bmod 3)$$

or

$$1 \bmod 3 = (m \bmod 3) \cdot 0 = 0 \bmod 3$$

This is impossible.
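
The two rules and the application can be checked numerically. The Python sketch below
uses the built-in `%` operator for the remainder; the helper name `respects_operations`
is hypothetical, and a check over a finite range is of course evidence rather than a
proof.

```python
def respects_operations(m, l, k):
    """Check that reducing modulo k commutes with addition and multiplication."""
    sum_ok = (m + l) % k == ((m % k) + (l % k)) % k
    product_ok = (m * l) % k == ((m % k) * (l % k)) % k
    return sum_ok and product_ok


# every pair in a small range satisfies both rules
print(all(respects_operations(m, l, k)
          for k in range(1, 13) for m in range(50) for l in range(50)))

print(25168 % 3)      # remainder of 25168 on division by 3
print(1347 % 3)       # remainder of 1347 on division by 3
print(25168 % 1347)   # nonzero, so 1347 does not divide 25168
```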


9.4 The Concept of a Group 


A group is a set G, or a collection of objects, together with a binary operation for
combining them. We usually denote the binary operation by “$\cdot$”. We assume that
this binary operation satisfies certain basic and plausible properties:


1. Associativity If $g, h, k \in G$ then $g \cdot (h \cdot k) = (g \cdot h) \cdot k$.

2. Identity element There is a distinguished element $e \in G$ such that, for all
$g \in G$, $e \cdot g = g \cdot e = g$.

3. Multiplicative inverse For each $g \in G$ there is an element $h \in G$ such that
$g \cdot h = h \cdot g = e$.


Notice that we do not assume that a group is commutative; that is, we do not
assume that $g \cdot h = h \cdot g$ for all $g, h \in G$. The property of associativity that
we postulate in Axiom 1 is a different property: it says that when we are combining
three elements we may group them, two by two, in either of the two obvious ways;
the same answer results. A group that is commutative is called abelian in honor of
Niels Henrik Abel (1802-1829).
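
For a finite set the three axioms (together with closure under the operation) can be
tested by brute force. The following Python sketch is only an illustration; the name
`is_group` and the two sample operations, addition and multiplication modulo 6, are
our own choices and do not appear in the text above.

```python
from itertools import product


def is_group(elements, op):
    """Brute-force check of closure and Axioms 1-3 for a finite set and operation."""
    elements = list(elements)
    # closure: combining two elements must give an element of the set
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Axiom 1: associativity
    if any(op(a, op(b, c)) != op(op(a, b), c)
           for a, b, c in product(elements, repeat=3)):
        return False
    # Axiom 2: an identity element e
    identities = [e for e in elements
                  if all(op(e, g) == g and op(g, e) == g for g in elements)]
    if not identities:
        return False
    e = identities[0]
    # Axiom 3: every element has an inverse
    return all(any(op(g, h) == e and op(h, g) == e for h in elements)
               for g in elements)


def add_mod_6(a, b):
    return (a + b) % 6


def mul_mod_6(a, b):
    return (a * b) % 6


print(is_group(range(6), add_mod_6))      # 0..5 under addition modulo 6
print(is_group(range(1, 6), mul_mod_6))   # 1..5 under multiplication modulo 6
```

The first call succeeds. The second fails already at closure, because 2 times 3 is 0
modulo 6 and 0 is not in the set.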


EXAMPLE 9.3 
Let G be the positive real numbers and let the group operation be multiplication:
$P(x, y) = x \cdot y$, where $\cdot$ is ordinary multiplication of reals. Then (G, P) is
a group.


Axiom 1: Of course multiplication of real numbers is associative.


Axiom 2: The number 1 is the identity element for multiplication:
$1 \cdot x = x \cdot 1 = x$ for any real number x.


Axiom 3: The multiplicative inverse of a group element is its ordinary reciprocal.
That is, if $x \in \mathbb{R}$ satisfies $x > 0$ then $1/x$ is its multiplicative inverse.














EXAMPLE 9.4 


Let G be the integers and let P(x, y) = x + y (ordinary addition). Then (G, P) is 
a group. 


Axiom 1: Certainly addition of integers is associative. 
Axiom 2: The number 0 is the additive identity. 


Axiom 3: The additive inverse of a group element is its negative: if $m \in \mathbb{Z}$
then $-m$ is its group inverse.














EXAMPLE 9.5 

Let G be the $k \times k$ matrices with real entries and nonzero determinant. This is
sometimes called the general linear group on k letters and is denoted by
$GL(k, \mathbb{R})$. Let P be ordinary matrix multiplication. Then (G, P) is a group.


Axiom 1: Matrix multiplication is associative.


Axiom 2: The group identity is the $k \times k$ matrix

$$I_k = \begin{pmatrix}
1 & 0 & \cdots & 0 & 0 \\
0 & 1 & \cdots & 0 & 0 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & 0 \\
0 & 0 & \cdots & 0 & 1
\end{pmatrix}$$

Thus, if $m \in G$, then $I_k \cdot m = m \cdot I_k = m$.

Axiom 3: The multiplicative inverse of a group element is its matrix inverse.
Thus if $m \in G$ then the inverse matrix $m^{-1}$ is the group inverse.


Notice in this example that it is important to restrict attention to square matrices,
so that multiplication of any two elements in any order will make sense. We also
require that each matrix have nonzero determinant, so that each matrix will have an
inverse. To see that G is closed under the group operation of matrix multiplication,
we must note that if $M, N \in G$ then $\det(M \cdot N) = (\det M)(\det N) \neq 0$.














Unlike the previous two examples, this last one is a noncommutative group (as
soon as $k \geq 2$).

The advantage of the axiomatic method, in the present context, is that when we 
prove a proposition or theorem about “a group G,” it applies simultaneously to all 
groups. Thus the axiomatic method gives us both a way of being concise and a way 
of cutting to the heart of the matter. 


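Proposition 9.1 The multiplicative identity for a group is unique.


Proof: Let G be a group. Let $e$ and $e'$ both be elements of G that satisfy Axiom 2.
Then

$$e = e \cdot e' = e'$$

(the first equality holds because $e'$ is an identity, the second because $e$ is).
Thus $e$ and $e'$ must be the same group element.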











Proposition 9.2 Let G be a group and $g \in G$. Then there is only one multiplicative
inverse for g.


Proof: Suppose that h and k both satisfy the properties of the multiplicative
inverse (Axiom 3) relative to g. Then

$$h = h \cdot e = h \cdot (g \cdot k) = (h \cdot g) \cdot k = e \cdot k = k$$

Thus h and k must be the same group element, establishing that the multiplicative
inverse is unique.














Proposition 9.3 Let g be an element of the group G. Then

$$(g^{-1})^{-1} = g$$

Proof: Observe that

$$g \cdot g^{-1} = e$$

and

$$g^{-1} \cdot g = e$$

Thus g satisfies the properties of the inverse element (Axiom 3) relative to $g^{-1}$.


Since the last proposition establishes that the inverse element for $g^{-1}$ is unique,
it follows that g must be the multiplicative inverse for $g^{-1}$. In other words,

$$(g^{-1})^{-1} = g.$$














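Proposition 9.4 Let g, h be elements of a group G. Then
$(g \cdot h)^{-1} = h^{-1} \cdot g^{-1}$.

Proof: We calculate that

$$\begin{aligned}
[h^{-1} \cdot g^{-1}] \cdot [g \cdot h] &= h^{-1} \cdot [g^{-1} \cdot (g \cdot h)] \\
&= h^{-1} \cdot [(g^{-1} \cdot g) \cdot h] \\
&= h^{-1} \cdot [e \cdot h] \\
&= h^{-1} \cdot h \\
&= e
\end{aligned}$$

A similar calculation shows that

$$[g \cdot h] \cdot [h^{-1} \cdot g^{-1}] = e$$

The assertion follows.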











Definition 9.1 Let G be a group and $H \subseteq G$. We call H a subgroup of G if the
following properties hold


Closure: The group operation P associated with G satisfies
$P : H \times H \to H$. In other words, H is closed under the group operation of G
(see the Exercises at the end of Chap. 5 for the concept of “closed”);


Identity element: The group identity e is an element of H;

Multiplicative inverse: If $h \in H$ then its group inverse element $h^{-1}$ lies in H.


Notice that the point of the last definition is that H is itself a group, using 
operations (and the group identity) inherited from the larger group G. 


EXAMPLE 9.6 
The pair $(\mathbb{Q}, +)$ of the rational numbers under the ordinary operation of
addition forms a group. Then the integers $\mathbb{Z} \subseteq \mathbb{Q}$ form a
subgroup. That is, the integers are a group under the same operation of addition.
They are closed under this operation.


EXAMPLE 9.7

Let G be the $3 \times 3$ matrices with real entries and nonvanishing determinant

$$\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$$

Let the group law be matrix multiplication. As we noted in Example 9.5, this is a
group. Let H be the subset of G consisting of those matrices with nonzero entries
on the main diagonal and zero entries off the diagonal.

$$\begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}$$

Then H is a subgroup.





Let G be a group and H a subgroup. Let us define a relation R on G as follows:
$(x, y) \in R$ provided that $x^{-1} \cdot y \in H$. If x is any element of G then of
course $x^{-1} \cdot x = e \in H$ so the relation is reflexive. Notice that
$x^{-1} y \in H$ if and only if $y^{-1} x \in H$, for these elements are multiplicative
inverses of each other, and H is a group. Thus the relation is symmetric. Finally,
suppose that $x, y, z \in G$, that x is related to y, and that y is related to z. Then
$x^{-1} \cdot y \in H$ and $y^{-1} \cdot z \in H$. As a result,
$[x^{-1} \cdot y] \cdot [y^{-1} \cdot z] = x^{-1} \cdot z \in H$. So we may conclude
that x is related to z, and the relation is transitive.

Therefore we have an equivalence relation. The equivalence relation partitions
the group G into pairwise disjoint subsets. What do these subsets look like?

If $x \in G$ then define $xH = \{x \cdot h : h \in H\}$. The set xH is called a coset of H.


Notice the following properties:


(1) Elements of xH are distinct: If $h, k \in H$ then $xh = xk$ implies
$x^{-1}(xh) = x^{-1}(xk)$ hence $h = k$. So the elements of xH are distinct.

(2) If $a, b \in xH$ then aRb: If $a, b \in xH$ then $a = xh$ and $b = xk$ for some
$h, k \in H$. But then

$$a^{-1} b = (xh)^{-1}(xk) = h^{-1} x^{-1} x k = h^{-1} k \in H$$

Thus a and b are related under R.


(3) If $a \in xH$ and aRb then $b \in xH$: Now let $a \in xH$ and assume that
$(a, b) \in R$. Thus $a^{-1} b \in H$. So there is an element $h \in H$ such that
$a = xh$ and there is another element $k \in H$ such that $a^{-1} b = k$. But then
$b = ak = (xh)k = x(hk) \in xH$.

It follows from (2) and (3) that the equivalence classes induced by R are precisely
the cosets of H. It follows from (1) that, if H has finitely many elements, then
each coset has the same number of elements.

We let G/H denote the collection of cosets of H in G. With an additional condition 
on H (that H be a normal subgroup), G/H can actually be made into a group. We 
shall not explore that idea here. 


Theorem 9.1 Let $G = \{g_1, \ldots, g_k\}$ be a group with finitely many elements. Let
$H \subseteq G$ be a subgroup with m elements. Then the integer m evenly divides the
integer k.


Proof: The group G partitions into the cosets of H. Each coset has m elements,
and the cosets are of course pairwise disjoint. If there are j cosets, then
$k = j \cdot m$. That means that m divides k.
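
The counting argument can be watched in a concrete case. The Python sketch below
assumes, purely for illustration, the group of integers 0, 1, ..., 11 under addition
modulo 12 and the subgroup H = {0, 4, 8}; since the operation is addition, the coset
xH of the text is written x + H. The function name `cosets` is hypothetical.

```python
def cosets(n, H):
    """Cosets x + H of a subgroup H of the integers modulo n under addition."""
    found = []
    for x in range(n):
        coset = frozenset((x + h) % n for h in H)
        if coset not in found:           # different x may give the same coset
            found.append(coset)
    return found


n = 12
H = {0, 4, 8}                            # a subgroup of order 3
parts = cosets(n, H)
for c in parts:
    print(sorted(c))
print(len(H) * len(parts) == n)          # (order of H) times (number of cosets)
```

Each coset has as many elements as H, the cosets do not overlap, and together they
exhaust the group, so the order of H divides the order of the group.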


EXAMPLE 9.8 
Let a relation on the integers $\mathbb{Z}$ be defined by xRy if $y - x$ is evenly
divisible by 6. This is an equivalence relation. There are six equivalence classes,
namely

$$E_0, E_1, E_2, E_3, E_4, E_5$$

Indeed,

$$\begin{aligned}
E_0 &= \{\ldots, -12, -6, 0, 6, 12, \ldots\} \\
E_1 &= \{\ldots, -11, -5, 1, 7, 13, \ldots\} \\
E_2 &= \{\ldots, -10, -4, 2, 8, 14, \ldots\} \\
E_3 &= \{\ldots, -9, -3, 3, 9, 15, \ldots\} \\
E_4 &= \{\ldots, -8, -2, 4, 10, 16, \ldots\} \\
E_5 &= \{\ldots, -7, -1, 5, 11, 17, \ldots\}
\end{aligned}$$

and so forth.

We add two equivalence classes as follows:

$$E_j + E_\ell = E_{j+\ell}$$

For instance

$$E_3 + E_4 = E_7 = E_1$$

You should check that this notion of addition is well defined (unambiguous). Also,
the identity element is $E_0$ and each $E_m$ has $E_{-m}$ as its additive inverse. In sum,
the collection of equivalence classes forms a group. This group is usually denoted by
$\mathbb{Z}/6\mathbb{Z}$ or $\mathbb{Z}_6$ or $\mathbb{Z}/6$. It is a group having six
elements; we call this a group of order 6. In general, the order of a group with
finitely many elements is just the number of elements in the group.

Let H be the subset of $\mathbb{Z}_6$ consisting of $E_0, E_2, E_4$. Verify that this is
a subgroup of order 3. Notice that 3 divides 6. The only other nontrivial subgroup is
that consisting of $E_0$ and $E_3$. Notice that it has order 2.

The number 6 has no nontrivial divisors besides 2 and 3 and the group
$\mathbb{Z}/6\mathbb{Z}$ has no other nontrivial subgroups.
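
The claim about subgroups of the group with six elements can be confirmed by
exhaustive search. In the Python sketch below each class is represented by its
remainder 0, 1, ..., 5; the function name `subgroups_of_Zk` is our own, and the search
over all subsets is practical only for small k.

```python
from itertools import combinations


def subgroups_of_Zk(k):
    """All subgroups of the integers modulo k under addition, by brute force."""
    found = []
    others = range(1, k)
    for size in range(0, k):
        for extra in combinations(others, size):
            H = {0, *extra}                       # a subgroup must contain 0
            closed = all((a + b) % k in H for a in H for b in H)
            inverses = all((-a) % k in H for a in H)
            if closed and inverses:
                found.append(sorted(H))
    return found


for H in subgroups_of_Zk(6):
    print(len(H), H)
```

Besides the trivial subgroup and the whole group, the search finds only the subgroup
of order 2 and the subgroup of order 3 named in the example.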














Of course there is nothing special about the number 6 in the last example. If k is
any positive integer then we may declare that xRy if $y - x$ is evenly divisible by k.
The result is a group of order k denoted by $\mathbb{Z}/k\mathbb{Z}$ or $\mathbb{Z}_k$
or $\mathbb{Z}/k$. If the positive integer m evenly divides k ($k = m \cdot p$) then
there will be one subgroup of order m and that group will consist of the elements
$E_0, E_p, E_{2p}, \ldots, E_{(m-1)p}$. When the context is understood we write the
elements of $\mathbb{Z}_k$ as $0, 1, 2, \ldots, k - 1$. We say that we are doing
arithmetic modulo k or arithmetic mod k.


EXAMPLE 9.9 
Consider the set $\mathbb{Z}_4 \times \mathbb{Z}_4$. This set may be conveniently
thought of as the set of ordered pairs (x, y) where $x, y \in \mathbb{Z}_4$. Define

$$(x, y) + (x', y') = (x + x', y + y')$$

where the addition is performed according to the group law of $\mathbb{Z}_4$. Then
$G = \mathbb{Z}_4 \times \mathbb{Z}_4$ so equipped is a group of order 16.

The number 4 divides the order of G, but now there is more than one subgroup
having order (that is, number of elements) 4. One such subgroup is
H = {(0, 0), (1, 0), (2, 0), (3, 0)}. Another is K = {(0, 0), (0, 1), (0, 2), (0, 3)}.
Yet another is L = {(0, 0), (2, 0), (0, 2), (2, 2)}.
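
The three subgroups can be verified mechanically. The Python sketch below represents
elements of the product group as pairs of remainders modulo 4; the helper names `add`
and `is_subgroup` are hypothetical, and the check follows the three conditions of
Definition 9.1.

```python
from itertools import product


def add(u, v):
    """Componentwise addition of two pairs, each coordinate taken modulo 4."""
    return ((u[0] + v[0]) % 4, (u[1] + v[1]) % 4)


def is_subgroup(S):
    """Check identity, closure, and inverses for a set S of pairs modulo 4."""
    S = set(S)
    has_identity = (0, 0) in S
    closed = all(add(u, v) in S for u, v in product(S, repeat=2))
    has_inverses = all(((-u[0]) % 4, (-u[1]) % 4) in S for u in S)
    return has_identity and closed and has_inverses


G = set(product(range(4), repeat=2))
H = {(0, 0), (1, 0), (2, 0), (3, 0)}
K = {(0, 0), (0, 1), (0, 2), (0, 3)}
L = {(0, 0), (2, 0), (0, 2), (2, 2)}

print(len(G))
for name, S in [("H", H), ("K", K), ("L", L)]:
    print(name, len(S), is_subgroup(S))
```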














Now let G be a group of finite order. Let g be a fixed element of G. Consider
the set H of all “powers” of g: $g^1 = g$, $g^2 = g \cdot g$,
$g^3 = g \cdot g \cdot g, \ldots$ as well as $g^{-1}$, $g^{-2} = (g^{-1})^2$,
$g^{-3} = (g^{-1})^3, \ldots$. Of course $g^0 = e$. It is easy to see that H
is a subgroup of G. We say that H is a cyclic group (subgroup) because it consists
of powers of the single element g.
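
To see cyclic subgroups in a concrete setting, the Python sketch below lists the
powers of each element in one small group. The choice of group is an assumption made
only for this illustration: the integers between 1 and 8 that are relatively prime
to 9, combined by multiplication modulo 9. The function names `units` and
`cyclic_subgroup` are our own.

```python
from math import gcd


def units(n):
    """Integers in 1..n-1 relatively prime to n; a group under multiplication mod n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def cyclic_subgroup(g, n):
    """List the powers g, g^2, g^3, ... modulo n until the identity 1 appears."""
    powers = []
    x = g % n
    while True:
        powers.append(x)
        if x == 1:
            return powers
        x = (x * g) % n


n = 9
G = units(n)
for g in G:
    H = cyclic_subgroup(g, n)
    # last column: does the order of H divide the order of G?
    print(g, H, len(G) % len(H) == 0)
```

In a finite group the negative powers of g already occur among the positive powers,
which is why the loop only multiplies by g. In every row the number of powers divides
the number of elements of the group, as Theorem 9.1 requires.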

Let k be the order of H. Then k will be the least positive integer such that
$g^k = e$. Since H is a subgroup of G, we see that k = order H must evenly divide
m = order G. It follows (provide the details as an exercise) that $g^m = e$. This
conclusion is so important that we display it in a theorem.

Theorem 9.2 Let G be a group of finite order m. If $g \in G$ then $g^m = e$.


The examples we have presented raise a natural question. If G is a group of 
order m and if k evenly divides m then does it follow that G has a subgroup of 
order k? In general the answer is “no.” You are requested in the exercises to provide 
a counterexample. However, the following theorem of Sylow provides a positive 
answer in a large number of important instances. 


Theorem 9.3 (Sylow) Let G be a group of finite order and let p be a prime. Suppose
that j is a positive integer and that $p^j$ evenly divides the order of G. Then G has
a subgroup of order $p^j$. Indeed, it has subgroups of all orders $p^\ell$,
$0 \le \ell \le j$.


Proof of a Special Case of the Sylow Theorems: Let G be a finite group and
suppose that the prime integer p divides the order n of G. We shall show that G
has a subgroup of order p. (The argument below forms the quotient G/H, which
requires H to be a normal subgroup; this is automatic when G is abelian, and we
assume that here.)

First suppose that the integer m has the property that $g^m = e$ for every $g \in G$
(we call m an exponent for G). Let $b \in G$, $b \neq e$, and let H be the cyclic group
generated by b (that is, the collection of all powers of b). It can be checked that
G/H forms a group. Then of course $b^m = e$, so the order of H divides m, and m is an
exponent for G/H. Thus (by induction on the order of the group) the order of G/H
divides a power of m. Because

$$[G : e] = [G : H] \cdot [H : e]$$

we may conclude therefore that the order of G divides a power of m.

Since p divides n, there is an element $x \in G$ such that the period of x (that is,
the minimal positive power of x that equals e) is divisible by p; otherwise G would
have an exponent m not divisible by p, and n could not divide a power of m. Let the
period be ps for some integer s. Then $x^s \neq e$ and $x^s$ has period p. Thus $x^s$
generates a subgroup of order p. That is what was to be shown.














We refer the interested reader to [LAN] or [HER] for a complete consideration 
of the Sylow theorems and their proofs. 

We close with a few remarks about the concept of isomorphism of groups. 
Consider first an example. You are familiar with the group $\mathbb{Z}_2$. It is a
group of order 2. Now suppose we consider a group G with two elements: G = {e, m}.
The element e will be the group identity and the rules of multiplication are

$$e \cdot m = m \qquad m \cdot e = m \qquad e \cdot e = e \qquad m \cdot m = e$$

You can check for yourself that, with this binary operation, G is indeed a group;
that is, it satisfies the axioms for a group.

On a formal level, the group $\mathbb{Z}_2$ and the group G are different. One has
elements 0 mod 2, 1 mod 2 and the other has elements e, m. But in fact it turns out
that they are the same group; they differ only in the sense that different names have
been given to the elements. To see this, write out the group law for $\mathbb{Z}_2$:

$$0 + 1 = 1 \qquad 1 + 0 = 1 \qquad 0 + 0 = 0 \qquad 1 + 1 = 0$$

Actually do this on a piece of paper. Now erase each occurrence of + and replace
it with $\cdot$. Also erase each occurrence of 0 and replace it by e and erase each
occurrence of 1 and replace it by m. What results is the group law that we specified
for the group G. This shows that $\mathbb{Z}_2$ and G are precisely the same group,
with just different names for the elements and for the binary operation.
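
The pencil-and-paper renaming can be mimicked in a few lines of Python. In this
sketch each group law is stored as a dictionary from pairs of elements to their
product; the variable names (`z2`, `G`, `rename`, `relabeled`) are our own.

```python
from itertools import product

# group law of the integers modulo 2 under addition
z2 = {(a, b): (a + b) % 2 for a, b in product([0, 1], repeat=2)}

# group law of G = {e, m} as specified in the text
G = {("e", "m"): "m", ("m", "e"): "m", ("e", "e"): "e", ("m", "m"): "e"}

# the renaming: every 0 becomes e and every 1 becomes m
rename = {0: "e", 1: "m"}

# apply the renaming to every entry of the first table
relabeled = {(rename[a], rename[b]): rename[c] for (a, b), c in z2.items()}

print(relabeled == G)   # the renamed table is exactly the table for G
```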

We have just seen an example of group isomorphism. Now we can give a formal 
definition. 


Definition 9.2 Let G and H be groups. A function $\phi : G \to H$ is said to be a
group isomorphism if it has the following properties


1. The function $\phi$ is one-to-one and onto.
2. If $g_1, g_2 \in G$ then $\phi(g_1 \cdot g_2) = \phi(g_1) \cdot \phi(g_2)$.


We say that G and H are isomorphic and we call the function $\phi$ an isomorphism.
It is the second condition of the definition that says, in effect, that both G
and H have the same group law. Let us derive a few simple properties of group
isomorphisms.


Proposition 9.5 Let $\phi : G \to H$ be an isomorphism of groups. If $e_G$ is the group
identity in G and $e_H$ is the group identity in H then $\phi(e_G) = e_H$.


Proof: We calculate that

$$\phi(e_G) = \phi(e_G \cdot e_G) = \phi(e_G) \cdot \phi(e_G)$$

The expressions that appear on the left and on the far right of this equation are
elements of H. Multiplying both sides of this equation on the right by
$[\phi(e_G)]^{-1}$ we find that

$$e_H = \phi(e_G)$$

That is what we wished to prove.


Proposition 9.6 Let $\phi : G \to H$ be a group isomorphism. If $g \in G$ then the group
inverse, in the group H, of $\phi(g)$ is $\phi(g^{-1})$.


Proof: We may check that

$$\phi(g) \cdot \phi(g^{-1}) = \phi(g \cdot g^{-1}) = \phi(e_G) = e_H \quad \text{(by Proposition 9.5)}$$

Also

$$\phi(g^{-1}) \cdot \phi(g) = \phi(g^{-1} \cdot g) = \phi(e_G) = e_H \quad \text{(by Proposition 9.5)}$$

Thus $\phi(g^{-1})$ possesses the defining properties of the group inverse of
$\phi(g)$. Since the group inverse of any group element is unique, our result follows.














The theory of groups has become a large and essential part of modern 
mathematics. It is also used in physics (in quantum mechanics, for instance), in 
engineering, and in theoretical computer science (for example, data compression 
theory uses group theory). 

It is a classical result of basic group theory that all finite abelian groups have been
classified. Indeed, it can be shown that any such group is a product (in the sense of
set theory) of cyclic groups. One of the triumphs of twentieth century mathematics
is that all simple groups of finite order have been classified. This result is the
product of the work of hundreds of mathematicians and will ultimately produce a book
of several thousand pages.


9.5 Some Theorems of Fermat 


Pierre de Fermat (1601-1665) was one of the most remarkable mathematicians 
in history. A judge in Toulouse by profession, Fermat studied mathematics as an 
amateur on his own time. Yet he became one of the most prominent mathematicians 
of his day, corresponding with many of the great professors throughout Europe. And 
he is remembered for seminal contributions to mathematics. 

One of Fermat’s basic results, which is widely used today in cryptography and
theoretical computer science, is known as Fermat's little theorem:


Theorem 9.4 Let p be a prime and let a be any positive integer. Then p divides
$a^p - a$.

We may state Fermat’s little theorem more succinctly, using the language of
modular arithmetic, as

$$a^p \bmod p = a \bmod p$$

Yet another variant is that if a is an integer that is relatively prime to p then

$$a^{p-1} = 1 \bmod p$$

This last formulation is quickly checked using group theory. For the set of nonzero
residues modulo p (the classes of the numbers relatively prime to p) forms a group
under multiplication modulo p, and there are $p - 1$ of them. So any number
relatively prime to p raised to the power which is the order of the group will give
the identity element, or 1 (this is Theorem 9.2). That is the result.

One can verify Fermat’s result by using proof by induction on a. We omit the 
details, but refer the reader to [HER]. 
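
Both forms of the theorem are easy to test numerically for small primes. The Python
sketch below is a spot check over a finite range, not a proof; the function name
`fermat_holds` is hypothetical, and the three-argument built-in `pow(a, e, p)` computes
the remainder of a to the power e on division by p.

```python
from math import gcd


def fermat_holds(p):
    """Check both forms of Fermat's little theorem for p over a range of a."""
    first = all((a**p - a) % p == 0 for a in range(1, 3 * p))
    second = all(pow(a, p - 1, p) == 1
                 for a in range(1, 3 * p) if gcd(a, p) == 1)
    return first and second


for p in [2, 3, 5, 7, 11, 13]:
    print(p, fermat_holds(p))

# the hypothesis that p is prime matters: 4 does not divide 2**4 - 2
print((2**4 - 2) % 4)
```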


Exercises 


1. Calculate each of the following modular quantities:


(a) $3 + 9 \bmod 5$
(b) $4 \cdot 6 \bmod 3$
(c) $9 - 4 \bmod 2$
(d) $2 \cdot 5 \bmod 4$
(e) $2 + 6 \bmod 3$
(f) $4 - 9 \bmod 2$


2. If m and n are positive integers then explain why
$$(m \bmod 2) \cdot (n \bmod 2) = (m \cdot n) \bmod 2$$
3. If m and n are positive integers then explain why
$$(m \bmod 2) + (n \bmod 2) = (m + n) \bmod 2$$

4. Show that 111 and 211 are relatively prime.
5. Find the greatest common divisor of 1024 and 100.
6. Verify that the set of all $2 \times 2$ matrices, equipped with the binary operation
of addition, forms a group.
7. Verify that the set of all $2 \times 2$ matrices, equipped with the binary operation
of multiplication, does not form a group. What restriction must we make in
order for this to be a group?


8. Verify that, in any group,

$$(a^2 \cdot b)^{-1} = b^{-1} \cdot a^{-2}$$


9. Verify that the set of all polynomials, equipped with the binary operation of
multiplication, does not form a group.

10. Use the Euclidean algorithm to find the greatest common divisor of each of
the following pairs of positive integers.

(a) 15, 80

(b) 24, 92

11. Give an example of integers n and k, with k dividing n, and a finite group
of order n that has no subgroup of order k. [Hint: The integer k should not
be prime.]





CHAPTER 10


Cryptography


10.1 Background on Alan Turing 


Alan Mathison Turing was born in 1912 in London, England. He died tragically 
in 1954 in Wilmslow, Cheshire, England. Today Turing is considered to have been 
one of the great mathematical minds of the twentieth century. He did not invent 
cryptography (as we shall see, even Julius Caesar engaged in cryptography). But 
he ushered cryptography into the modern age. The current vigorous interaction 
of cryptography with computer science owes its genesis in significant part to the 
work of Turing. Turing also played a decisive role in many of the key ideas of 
modern logic. It is arguable that Turing had the decisive ideas for inventing the 
stored program computer [although it was John von Neumann (1903-1957), an- 
other twentieth-century mathematical genius, who together with Herman Goldstine 
(1913-2004), actually carried out the ideas]. 

Turing had difficulty fitting in at the British “public schools” which he attended.
(Note that a “public school” in Britain is what we in America would call a private
school and vice versa.) Young Turing was more interested in pursuing his own
thoughts than in applying himself to the dreary school tasks that were designed for
average students. At the Sherborne School, Turing had little patience for the tedious
math techniques that the teachers taught. Yet he won almost every mathematics
prize at the school. He was given poor marks in penmanship, and he struggled with
English.

Turing had a passion for science beginning at a very young age. He later said that 
the book Natural Wonders Every Child Should Know had had a seminal influence 
on him. When he was still quite young, he read Einstein’s papers on relativity and 
he read Arthur Eddington’s account of quantum mechanics in the book The Nature 
of the Physical World. 

In 1928, at the Sherborne School, Alan Turing became friends with Christopher 
Morcom. Now he had someone in whom he could confide, and with whom he 
could share scientific ideas and inquiries. Turing had never derived such intellectual 
companionship from either his classmates or his rather diffident school teachers. 
Sadly, Morcom died suddenly in 1930. This event had a shattering effect on the 
young Alan Turing. The loss of his companion led Turing to consider spiritual 
matters, and over time this led him to an interest in physics. 

It may be mentioned that Turing developed early on an interest in sports. He was 
a very talented athlete—almost at the Olympic level—and he particularly excelled 
in running. He maintained an interest in sports throughout his life. 

In 1931 Alan Turing entered King’s College at Cambridge University. Turing 
earned a distinguished degree at King’s in 1934, followed by a fellowship at King’s. 
In 1936 he won the Smith Prize for his work in probability theory. In particular, 
Turing was one of the independent discoverers of the Central Limit Theorem. 

In 1935 Turing took a course from Max Newman on the foundations of mathematics.
Thus his scientific interests took an abrupt shift. The hot ideas of the time
were Gödel’s incompleteness theorem (which says that virtually any mathematical
theory will have true statements in it that cannot be proved) and, what is closely
related, David Hilbert’s questions about decidability.

In 1936 Alan Turing published his seminal paper “On computable numbers, with 
an application to the Entscheidungsproblem.” Here the Entscheidungsproblem is 
the fundamental question of how to decide—in a manner that can be executed by 
a machine—when a given mathematical question is provable. In this paper Turing 
first described his idea for what has now become known as the Turing machine. We 
now take a mathematical detour to talk about Turing machines. 


10.2 The Turing Machine 


A Turing machine is a device for performing effectively computable operations.
It consists of a machine through which a bi-infinite paper tape is fed. The tape is
divided into an infinite sequence of congruent boxes (Fig. 10.1). Each box has either
a numeral 0 or a numeral 1 in it. The Turing machine has finitely many “states”
$S_1, S_2, \ldots, S_n$. In any given state of the Turing machine, one of the boxes is
being scanned.

Figure 10.1 A Turing machine.

After scanning the designated box, the Turing machine does three things:


1. Either it erases the numeral 1 that appears in the scanned box and replaces
it with a 0, or it erases the numeral 0 that appears in the scanned box and
replaces it with a 1, or it leaves the box unchanged.

2. It moves the tape one box (or one unit) to the left or to the right.


3. It goes from its current state $S_i$ into a new state $S_j$ (which may be the
same state).


It turns out that every logical procedure, every algorithm, every mathematical 
proof, every computer program can be realized as a Turing machine. The Turing 
machine is a “universal logical device.” The next section contains a simple instance 
of a Turing machine. In effect, Turing had designed a computer before technology 
had made it possible to actually build one. 


10.2.1 AN EXAMPLE OF A TURING MACHINE 


Here is an example of a Turing machine for calculating x + y: 
































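State | Old value | New value | Move (L or R) | New state | Explanation
------+-----------+-----------+---------------+-----------+-----------------
  0   |     1     |     1     |       R       |     0     | Pass over x
  0   |     0     |     1     |       R       |     1     | Fill gap
  1   |     1     |     1     |       R       |     1     | Pass over y
  1   |     0     |     0     |       L       |     2     | End of y
  2   |     1     |     0     |       L       |     3     | Erase a 1
  3   |     1     |     0     |       L       |     4     | Erase another 1
  4   |     1     |     1     |       L       |     4     | Back up
  4   |     0     |     0     |       R       |     5     | Halt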

















If you look hard at the logic of this Turing machine, you will see that it thinks of x
as a certain number of 1s, and it thinks of y as a certain number of 1s. It scans the x
units, and writes a 1 to the right of these; then it scans the y units, and stops at
the 0 to the right of these. The two blocks of 1s are joined into a single block (by
erasing the space in between) and then the two extra 1s are erased. The result is
x + y. Provide the details of this argument as an exercise.
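
One way to work through the exercise is to run the table. The Python sketch below
simulates it under assumptions that the text does not spell out and that we adopt only
for illustration: a number n is written as a block of n + 1 ones, the blocks for x and
y are separated by a single 0, the machine starts in state 0 scanning the leftmost 1
of the x block, the moves L and R are treated as moves of the scanning position, and
state 5 means that the machine has halted. The names `RULES` and
`add_on_turing_machine` are our own.

```python
from collections import defaultdict

# (state, scanned value) -> (value to write, move, new state), copied from the table
RULES = {
    (0, 1): (1, "R", 0),   # pass over x
    (0, 0): (1, "R", 1),   # fill gap
    (1, 1): (1, "R", 1),   # pass over y
    (1, 0): (0, "L", 2),   # end of y
    (2, 1): (0, "L", 3),   # erase a 1
    (3, 1): (0, "L", 4),   # erase another 1
    (4, 1): (1, "L", 4),   # back up
    (4, 0): (0, "R", 5),   # halt
}
HALT = 5


def add_on_turing_machine(x, y):
    """Run the table on the encoded input x, y and decode the resulting tape."""
    tape = defaultdict(int)                       # every unwritten box holds 0
    cells = [1] * (x + 1) + [0] + [1] * (y + 1)   # assumed encoding: n -> n + 1 ones
    for i, v in enumerate(cells):
        tape[i] = v
    state, head = 0, 0
    while state != HALT:
        write, move, state = RULES[(state, tape[head])]
        tape[head] = write
        head += 1 if move == "R" else -1
    ones = sum(tape.values())
    return ones - 1                               # decode: n + 1 ones stand for n


print(add_on_turing_machine(2, 3))
```

Under this encoding the input holds x + y + 2 ones, filling the gap makes x + y + 3,
and erasing two leaves x + y + 1, which is the block that stands for x + y.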


10.3 More on the Life of Alan Turing 


The celebrated logician Alonzo Church (1903-1995) published a paper closely
related to Turing’s at about the same time. As a result, Church and Turing ended up 
communicating and sharing ideas. Subsequently, in 1936, Turing went to Princeton 
for graduate study under Church’s direction. 

When Turing returned to Cambridge in 1938, he commenced work on actually 
building a computer. It was designed to be a rather crude, mechanical device, with 
a great many gears and wheels. In fact Turing had a very specific purpose in mind 
for his machine. 

One of the great mathematical problems of the day (and it is still a hot open 
problem as of this writing) was to prove the Riemann hypothesis. The Riemann 
hypothesis, posed by Bernhard Riemann in 1859, concerns the location of the zeros 
of a certain complex function (the celebrated Riemann zeta function). An affirmative
answer to the Riemann hypothesis would tell us a great deal about the distribution 
of prime numbers and have profound consequences for number theory and for 
cryptography. 

According to Andrew Hodges (1949- ), the Turing biographer,


Apparently [Turing] had decided that the Riemann Hypothesis was probably 
false, if only because such great efforts have failed to prove it. Its falsity would 
mean that the zeta function did take the value zero at some point which was 
off the special line, in which case this point could be located by brute force, 
just by calculating enough values of the zeta function. 


Turing did his own engineering work, hence he got involved in all the fine details 
of constructing this machine. He planned on 80 meshing gearwheels with weights 
attached at specific distances from their centers. The different moments of inertia 
would contribute different factors to the calculation, and the result would be the 
location of and an enumeration of the zeros of $\zeta$.

Visits to Turing’s apartment would find the guest greeted by heaps of gear wheels 
and axles and other junk strewn about the place. Although Turing got a good start 
cutting the gears and getting ready to assemble the machine, more pressing events 
(such as World War II) interrupted his efforts. His untimely death prevented the 
completion of the project. 

When war broke out in 1939, Turing went to work for the Government Code 
and Cypher School at Bletchley Park. Turing played a seminal role in breaking 
German secret codes, and it has been said that his work saved more lives during 
the war than that of any other person. One of his great achievements during this 
time was the construction of the Bombe machine, a device for cracking all the 
encoded messages generated by the dreaded German Enigma machine. In fact 
Turing used ideas from abstract logic, together with some earlier contributions of 
Polish mathematicians, to design the Bombe. Turing’s important contributions to 
the war effort were recognized with the award of an O.B.E. (Order of the British 
Empire) in 1945. 

After the war Turing was invited by the National Physical Laboratory in London 
specifically to design a computer. He wrote a detailed proposal for the Automatic 
Computing Engine (ACE) in 1946, and that document is in fact a discursive prospectus 
for a stored-program computer. The project that Turing proposed turned out to be 
too grandiose for practical implementation, and it was shelved. 

Turing’s interests turned to topics outside of mathematics, including neurology 
and physiology. But he maintained his passion for computers. In 1948 he accepted 
a position at the University of Manchester. There he became involved in a project, 
along with F. C. Williams and T. Kilburn, to construct a computing machine. 

In 1951 Alan Turing was elected a Fellow of the Royal Society—the highest 
honor that can be bestowed upon a British scientist. This accolade was largely in 
recognition of his work on Turing machines. 

Turing had a turbulent personal life. In 1952 he was arrested for violation of the 
British homosexuality statutes. He was convicted, and sentenced to take the drug 
oestrogen for one year. Turing subsequently rededicated himself to his scientific 
work, concentrating particularly on spinors and relativity theory. Unfortunately, 
because of his legal difficulties, Turing lost his security clearance and was labeled 
something of a “security risk.” He had continued working with the cypher school 
at Bletchley, but his loss of clearance forced that collaboration to end. These events 
had a profound and saddening effect on Alan Turing. 

Turing died in 1954 of potassium cyanide poisoning while conducting elec- 
trolysis experiments. The cyanide was found on a half-eaten apple. The police 
concluded that the death was a suicide, though people close to Turing argue that it 
was an accident. 


10.4 What Is Cryptography? 


We use Alan Turing’s contributions as a touchstone for our study of cryptography. 
Cryptography is currently a very hot field, due in part to the availability of high 
speed digital computers to carry out decryption algorithms, in part to new and 
exciting connections between cryptography and number theory and logic, and in 
part to the need for practical coding methods both in industry and in government. 

The discussion of cryptography that appears below is inspired by the lovely 
book [KOB]. We refer the reader to that source for additional ideas and further 
reading. 

As we always do in mathematics, let us begin by introducing some terminology. 
Cryptography is the study of methods for sending text messages in disguised form 
in such a manner that only the intended recipient can remove the disguise and read 
the message. The original message that we wish to send is called the plaintext and 
the disguised message is called the ciphertext. We shall always assume that both 
our plaintext and our ciphertext are written in the standard roman alphabet (that is, 
the letters A through Z) together perhaps with some additional symbols like “blank 
space (denoted ⊔),” “question mark (?),” and so forth. The process of translating 
a plaintext message into a ciphertext message is called encoding or enciphering 
or encrypting. The process of translating an encoded message back to a plaintext 
message is called deciphering or sometimes de-encrypting. 

For convenience, we usually break up both the plaintext message and the cipher- 
text message into blocks or units of characters. We call these pieces the message 
units, but we may think of them as “words” (but they are not necessarily English 
words). Sometimes we will declare in advance that all units are just single letters, 
or perhaps pairs of letters (these are called digraphs) or sometimes triples of letters 
(called trigraphs). Other times we will let the units be of varying sizes—just as the 
words in any body of text have varying sizes. An enciphering transformation is a 
function that assigns to each plaintext unit a ciphertext unit. The deciphering trans- 
formation is the inverse mapping that recovers the plaintext unit from the ciphertext 
unit. Any setup as we have just described is called a cryptosystem. 

In general it is awkward to mathematically manipulate the letters of the alphabet. 
We have no notions of addition or multiplication on these letters. So it is convenient 
to associate to each letter a number. Then we can manipulate the numbers. For 
instance, it will be convenient to make the assignment 


A ↔ 0 
B ↔ 1 
C ↔ 2 
  ⋮ 
X ↔ 23 
Y ↔ 24 
Z ↔ 25 

Thus if we see the message 
22 7 0 19 12 4 22 14 17 17 24 
then we can immediately translate this to 
WHATMEWORRY 
or 
WHAT ME WORRY 


Notice two things. First, in cryptography we generally do not worry about capital 
and lowercase letters: everything is uppercase. Second, if we do not have a symbol 
for “blank space”, then messages are awkward to read. 

One device of which we will make frequent and consistent use is modular arith- 
metic. Recall that if n and k are integers with k > 0, then n mod k is that unique 
integer n′ between 0 and k − 1 inclusive such that n − n′ is divisible by k. For example, 


13 mod 5 = 3 
−23 mod 7 = 5 
82 mod 14 = 12 
10 mod 3 = 1 


How do we calculate these values? Look at the first of these. To determine 
13 mod 5, we divide 5 into 13: Of course 5 goes into 13 with quotient 2 and 
remainder 3. It is the remainder that we seek. Thus 


13 mod 5 = 3 


It is similar with the other examples. To determine 82 mod 14, divide 14 into 82. It 
goes 5 times with remainder 12. Hence 


82 mod 14 = 12 


It is convenient that modular arithmetic respects the arithmetic operations. For 
example, 


8 × 7 = 56 and 56 mod 6 = 2 

But 

8 mod 6 = 2 and 7 mod 6 = 1 and 2 × 1 = 2 


So it does not matter whether we pass to mod 6 before multiplying or after multi- 
plying. Either way we obtain the same result 2. Similar properties hold for addition 
and subtraction. One must be a bit more cautious with division, as we shall see 
below. 

We supply some further examples: 


[3 mod 5] × [8 mod 5] = 24 mod 5 = 4 
[7 mod 9] + [5 mod 9] = 12 mod 9 = 3 
[4 mod 11] − [9 mod 11] = −5 mod 11 = 6 
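
The remainders above are easy to check by machine. What follows is a minimal sketch in Python (our choice of language, not something the text prescribes). It relies on the fact that Python's % operator returns a value between 0 and k − 1 whenever k is positive, which agrees with the definition of n mod k used here. The variable names are ours.

```python
# Pairs (n, k) taken from the worked examples in the text.
examples = [(13, 5), (-23, 7), (82, 14), (10, 3)]

# For positive k, Python's % always lands in 0..k-1, matching "n mod k".
remainders = [n % k for n, k in examples]
for (n, k), r in zip(examples, remainders):
    print(f"{n} mod {k} = {r}")

# Reducing before or after an operation gives the same residue.
product_after = (8 * 7) % 6                  # multiply first, then reduce
product_before = ((8 % 6) * (7 % 6)) % 6     # reduce first, then multiply
difference = ((4 % 11) - (9 % 11)) % 11      # subtraction behaves the same way
print(product_after, product_before, difference)
```

Note that some other programming languages return a negative remainder for a negative n, so the −23 mod 7 example is the one to test if this sketch is ported.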


Now we begin to learn some cryptography by way of examples. 


EXAMPLE 10.1 

We use the ordinary 26-letter roman alphabet A-Z, with the numbers 0-25 assigned 
to the letters as indicated above. Let S = {0,1,2,..., 25}. We will consider units 
consisting of single letters. Thus our cryptosystem will consist of a function f : 
S → S which assigns to each unit of plaintext a new unit of ciphertext. In particular, 
let us consider the specific example 


f(P) = P + 5     if P < 21 
f(P) = P − 21    if P ≥ 21 

Put in other words, 

f(P) = P + 5 mod 26 (10.1) 


Next let us use this cryptosystem to encode the message 
GOAWAY 
or 


GO AWAY 

The first step is that we transliterate the letters into numbers (because, as noted 
earlier, numbers are easier to manipulate). Thus GOAWAY becomes 6 14 0 22 0 24. 

Now we apply the “shift encryption” (Eq. (10.1)) to this sequence of numbers. 
Notice that 


f(6) = 6 + 5 mod 26 = 11 mod 26 = 11 
f(14) = 14 + 5 mod 26 = 19 
f(0) = 0 + 5 mod 26 = 5 
f(22) = 22 + 5 mod 26 = 1 
f(0) = 0 + 5 mod 26 = 5 
f(24) = 24 + 5 mod 26 = 3 

Thus our ciphertext is 11 19 5 1 5 3. In practice, we may convert this ciphertext 
back to roman letters using our standard correspondence (A ↔ 0, B ↔ 1, and so 
on). The result is LTFBFD. Thus the encryption of “GO AWAY” is “LTFBFD.” 
Notice that we have no coding for a blank space, so we ignore it. 

This is a very simple example of a cryptosystem. It is said that Julius Caesar 
used this system with 26 letters and a shift of 3. We call this encryption system a 
“shift transformation.” 
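
The shift transformation of Eq. (10.1) can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the helper names to_numbers, to_letters, and shift_encrypt are ours, and anything that is not one of the letters A through Z (such as a blank) is simply dropped, as in the example.

```python
def to_numbers(text):
    """Map A..Z to 0..25, ignoring blanks and any other characters."""
    return [ord(ch) - ord("A") for ch in text.upper() if "A" <= ch <= "Z"]

def to_letters(numbers):
    """Map 0..25 back to A..Z."""
    return "".join(chr(n + ord("A")) for n in numbers)

def shift_encrypt(plaintext, shift=5):
    """Apply f(P) = P + shift mod 26 to each letter of the plaintext."""
    return to_letters((p + shift) % 26 for p in to_numbers(plaintext))

print(to_numbers("GO AWAY"))     # [6, 14, 0, 22, 0, 24]
print(shift_encrypt("GO AWAY"))  # LTFBFD
```

Calling shift_encrypt with shift=3 gives the variant attributed to Julius Caesar.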














Now let us use this same cryptosystem to encode the word “BRAVO.” First, we 
translate our plaintext word to numbers: 


1 17 0 21 14 
Now we add 5 mod 26 to each numerical entry. The result is 
6 22 5 0 19 
Notice that the fourth entry is 0 because 
21 + 5 mod 26 = 26 mod 26 = 0 mod 26 


Thus if we wanted to send the message “BRAVO” in encrypted form, we would send 
6 22 5 0 19. We can translate the encrypted message to roman letters as “GWFAT.” 

Conversely, we decrypt a message by subtracting 5 mod 26. Suppose, for in- 
stance, that you receive the encrypted message 


24 12 5 18 15 3 19 25 
We decrypt by applying the function f⁻¹(Q) = Q − 5 mod 26. The result is 
19 7 0 13 10 24 14 20 
This easily translates to 
THANKYOU 
or 
THANK YOU 
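
Decryption is the same computation run backward. The short sketch below (the function name shift_decrypt_numbers is ours) applies Q ↦ Q − 5 mod 26 to the received numbers; the reduction mod 26 is what turns 3 − 5 = −2 into 24.

```python
def shift_decrypt_numbers(cipher_numbers, shift=5):
    """Apply the inverse map Q -> Q - shift mod 26 to each number."""
    return [(q - shift) % 26 for q in cipher_numbers]

received = [24, 12, 5, 18, 15, 3, 19, 25]
plain_numbers = shift_decrypt_numbers(received)
plain_text = "".join(chr(n + ord("A")) for n in plain_numbers)

print(plain_numbers)  # [19, 7, 0, 13, 10, 24, 14, 20]
print(plain_text)     # THANKYOU
```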


In a typical, real-life circumstance, you receive an encrypted message and you 
do not know the method of encryption. It is your job to figure out how to decode the 
message. We call this process breaking the code, and the science of codebreaking 
is called cryptanalysis. 


EXAMPLE 10.2 

If the codebreaker happens to know that the message he/she has received is en- 
crypted using a shift transformation, then there is a reasonable method to proceed. 
Imagine that you receive the message 


CQNKNJCUNBOXANENA 


This looks like nonsense. But the cryptographer has reason to believe that this message 
has been encoded using a shift transformation on single letters of the 26-letter 
alphabet. It remains to find the numerical value of the shift. 

We use a method called frequency analysis. The idea of this technique is that 
“E” is known to be the most frequently occurring letter in the English language. 
Thus we may suppose that the most frequently occurring character in the ciphertext 
is the encryption of “E” (not “E” itself). In fact we see that the character “N” occurs 
five times in the ciphertext, and that is certainly the most frequently occurring 
letter. If we hypothesize that “N” is the encryption of “E”, then we see that “4” 
has been translated to “13” in the encryption. Thus the encryption key is P ↦ 
P + 9 mod 26. And therefore the decryption scheme is P ↦ P − 9 mod 26. If this 
putative decryption scheme gives a sensible message, then it is likely the correct 
choice (as any other decryption scheme will likely give nonsense). Let us try this 
scheme and see what result it gives. We have 
CQNKNJCUNBOXANENA 
has numerical realization 


2 16 13 10 13 9 2 20 13 1 14 23 0 13 4 13 0 
Under our decryption scheme, this translates to 


19 7 4 1 4 0 19 11 4 18 5 14 17 4 21 4 17 
which has textual realization 


THEBEATLESFOREVER 


In other words, the secret message is 
THE BEATLES FOREVER 
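
The frequency analysis just carried out by hand can be sketched as follows. This is an illustration only: the function names are ours, and the guess rests on the single assumption used in the example, namely that the most common ciphertext letter is the encryption of “E.” On short messages that assumption can easily fail, in which case one would try the next most likely letters.

```python
from collections import Counter

def guess_shift(ciphertext, assumed_plain="E"):
    """Guess the shift by assuming the commonest cipher letter encrypts assumed_plain."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    commonest, _count = Counter(letters).most_common(1)[0]
    return (ord(commonest) - ord(assumed_plain)) % 26

def shift_decrypt(ciphertext, shift):
    """Undo a shift cipher on A..Z; any other characters are dropped."""
    return "".join(
        chr((ord(ch) - ord("A") - shift) % 26 + ord("A"))
        for ch in ciphertext.upper()
        if "A" <= ch <= "Z"
    )

ciphertext = "CQNKNJCUNBOXANENA"
shift = guess_shift(ciphertext)
print(shift, shift_decrypt(ciphertext, shift))  # 9 THEBEATLESFOREVER
```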


The trouble with the shift transformation is that it is just too simpleminded. 
It is too easy to break. There are variants that make it slightly more sophisti- 
cated. For example, suppose that the East Coast and the West Coast branches of 
National Widget Corporation cook up a system for sending secret messages back 
and forth. They will use a shift transformation, but in each week of the year they 
will use a different shift. This adds a level of complexity to the process. But the 
fact remains that, using a frequency analysis, the code can likely be broken in any 
given week. 














10.5 Encryption by Way of 
Affine Transformations 


We can add a genuine level of sophistication to the encryption process by adding 
some new mathematics. Instead of considering a simple shift of the form P ↦ 
P + b for some fixed integer b, we instead consider an affine transformation of the 
form P ↦ aP + b. Now we are both multiplying (or dilating) the element P by 
an integer a and then translating it by b. 


10.5.1 DIVISION IN MODULAR ARITHMETIC 


There is a subtlety in the application of the affine transformation method that we 
must consider before we can look at an example. If the encryption scheme is 
P ↦ Q = aP + b, then the decryption scheme must be the inverse function. In 
other words, we solve for P in terms of Q. This just involves elementary algebra, 
and we find that 


P = [1/a](Q − b) mod 26 

We see that decryption, in the context of an affine transformation, involves division 
in arithmetic modulo 26. This is a new idea, and we should look at a couple of 
simple examples before we proceed with our cryptographic considerations. 

We want to consider division modulo 26. Thus if a and b are whole numbers, 
then we want to calculate b/a and we want the answer to be another whole number 
modulo 26. This is possible only because we are cancelling multiples of 26, and 
it will only work when a has no common prime factors with 26. Let us consider 
some examples. 

First let us calculate 4/7 mod 26. What does this mean? We are dividing the 
whole number 4 by the whole number 7, and this looks like a fraction. But things 
are a bit different in modular arithmetic. We seek a number k such that 


4/7 mod 26 = k 

or 

4 = 7 · k mod 26 

or 

4 − 7 · k is divisible by 26 

We simply try different values for k, and we find with k = 8 that 

4 − 7 · 8 = 4 − 56 = −52 

is indeed divisible by 26. In conclusion, 

4/7 mod 26 = 8 

We see the somewhat surprising conclusion that the fraction 4/7 can be realized 
as a whole number in arithmetic modulo 26. 

Next let us try to calculate 1/4 mod 26. This is doomed to fail, because 4 and 26 
have the prime factor 2 in common. We seek an integer k such that 


1 = 4 · k mod 26 
or in other words 
1 − 4k is a multiple of 26 


But of course 4k will always be even so 1 − 4k will always be odd; it cannot be 
a multiple of the even number 26. This division problem cannot be solved. 

We conclude this brief discussion with the example 2/9 mod 26. We invite the 
reader to discover that the answer is 6 mod 26. 
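
The “try different values for k” approach used in these three examples is easy to mechanize. The sketch below (the function name mod_divide_by_search is ours) returns every k between 0 and m − 1 for which b − a · k is divisible by m. An empty list means the division cannot be carried out.

```python
def mod_divide_by_search(b, a, m):
    """Return every k in 0..m-1 such that b - a*k is divisible by m."""
    return [k for k in range(m) if (b - a * k) % m == 0]

print(mod_divide_by_search(4, 7, 26))  # [8]
print(mod_divide_by_search(1, 4, 26))  # []  (4 and 26 share the factor 2)
print(mod_divide_by_search(2, 9, 26))  # [6]
```

A search of this kind takes m trials, which is fine for m = 26 but hopeless for the very large moduli used in serious cryptography.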

There is in fact a mathematical device for performing division in modular arith- 
metic. It is the classical Euclidean algorithm. This simple idea is one of the most 
powerful in all of number theory. It says this: if n and d are integers with d > 0, 
then d divides into n some whole number q times with some remainder r, and 
0 ≤ r < d. In other words, 


n = d · q + r 


You have been using this idea all your life when you calculate a long division 
problem (not using a calculator, of course). We shall see in the next example that 
the Euclidean algorithm is a device for organizing information so that we can 
directly perform long division in modular arithmetic. 


EXAMPLE 10.3 
Let us calculate 1/20 in arithmetic mod 57. We apply the Euclidean algorithm to 
57 and 20. Thus we begin with 


57 = 2 · 20 + 17 


We continue by repeatedly applying the Euclidean algorithm to divide the divisor 
by the remainder: 


20 = 1 · 17 + 3 
17 = 5 · 3 + 2 
3 = 1 · 2 + 1 




Now, as previously indicated, we utilize this Euclidean algorithm information to 
organize our calculations. Begin with the last line to write 
1 = 3 − 1 · 2 
= 3 − 1 · (17 − 5 · 3) 
= [20 − 17] − 1 · ([57 − 2 · 20] − 5 · [20 − 17]) 
= 20 · 8 + 17 · (−6) − 57 
= 20 · 8 + (57 − 2 · 20) · (−6) − 57 
= 20 · 20 − 7 · 57 


This calculation tells us that 1 = 20 · 20 mod 57. In other words, 1/20 = 
20 mod 57. 
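
The back-substitution in this example can be organized as a loop. The following sketch uses the standard “extended” form of the Euclidean algorithm, which keeps track of the coefficients as it goes rather than unwinding them at the end; it is a common textbook variant and not necessarily the bookkeeping the example has in mind. The function names extended_gcd and mod_inverse are ours.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and g == s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r                       # one division step: old_r = q*r + remainder
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def mod_inverse(a, m):
    """Return 1/a mod m, or raise ValueError if a and m share a factor."""
    g, s, _t = extended_gcd(a % m, m)
    if g != 1:
        raise ValueError(f"{a} has no inverse modulo {m}")
    return s % m

print(extended_gcd(20, 57))  # (1, 20, -7), that is, 1 = 20*20 - 7*57
print(mod_inverse(20, 57))   # 20
```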














We offer the reader the exercise of calculating 1/25 mod 64 using the Euclidean 
algorithm. 


10.5.2 INSTANCES OF THE AFFINE 
TRANSFORMATION ENCRYPTION 


EXAMPLE 10.4 
Let us encrypt the message “GO AWAY” using the affine transformation P ↦ 
5P + 6 mod 26. As usual, 
GO AWAY has numerical realization 6 14 0 22 0 24 
Under the affine transformation, we obtain the new numerical realization 
10 24 6 12 6 22 
In roman letters, the message has become the ciphertext 


KYGMGW 


In order to decrypt the message, we must use the inverse affine transformation. 
If R = 5P + 6 mod 26, then P = [1/5](R − 6) mod 26. Using modular arithmetic, 
we see that 10 corresponds to 

[1/5](10 − 6) = [1/5] · 4 = 6 mod 26 

(because 5 · 6 mod 26 = 30 mod 26 = 4 mod 26). Likewise 24 corresponds to 

[1/5](24 − 6) = [1/5] · 18 = 14 mod 26 

(because 5 · 14 mod 26 = 70 mod 26 = 18 mod 26). We calculate the rest of the 
correspondences: 

[1/5](6 − 6) = [1/5] · 0 = 0 mod 26 

(because 5 · 0 mod 26 = 0 mod 26). Next, 

[1/5](12 − 6) = [1/5] · 6 = 22 mod 26 

(because 5 · 22 mod 26 = 110 mod 26 = 6 mod 26). Again, 

[1/5](6 − 6) = [1/5] · 0 = 0 mod 26 

And, finally, 

[1/5](22 − 6) = [1/5] · 16 = 24 mod 26 

(because 5 · 24 mod 26 = 120 mod 26 = 16 mod 26). 
In sum, we have applied our decryption algorithm to recover the message 





6 14 0 22 0 24 
This transliterates to 
GOAWAY 
or 
GO AWAY 
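
The encryption and decryption of this example can be sketched as below. The function names are ours. For the division by a we lean on Python's built-in pow(a, -1, m), which returns the inverse of a modulo m in Python 3.8 and later and raises ValueError when a and m have a common factor.

```python
def affine_encrypt(numbers, a, b, m=26):
    """Apply P -> a*P + b mod m to each plaintext number."""
    return [(a * p + b) % m for p in numbers]

def affine_decrypt(numbers, a, b, m=26):
    """Apply Q -> [1/a](Q - b) mod m to each ciphertext number."""
    a_inv = pow(a, -1, m)  # modular inverse; needs Python 3.8 or later
    return [(a_inv * (q - b)) % m for q in numbers]

plain = [6, 14, 0, 22, 0, 24]                       # GOAWAY
cipher = affine_encrypt(plain, 5, 6)
print(cipher)                                       # [10, 24, 6, 12, 6, 22]
print("".join(chr(n + ord("A")) for n in cipher))   # KYGMGW
print(affine_decrypt(cipher, 5, 6))                 # [6, 14, 0, 22, 0, 24]
```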











In a real-life situation—if we were endeavoring to decrypt a message—we would 
not know in advance which affine transformation was used for the encoding. We 
now give an example to illustrate how to deal with such a situation. 


EXAMPLE 10.5 
We continue to work with the 26-letter roman alphabet. We receive a block of 
ciphertext and wish to decode it. We notice that the most frequently occurring char- 
acter in the ciphertext is “M” and the second most frequently occurring character in 
the ciphertext is “R.” It is well known that, in ordinary English, the most commonly 
occurring letter is “E” and the second most commonly occurring letter is “T.” So 
it is natural to hypothesize that we are dealing with an affine transformation that 
assigns “E” to “M” and “T” to “R.” 

This means that we seek an affine transformation f(P) = aP + b such that 
f(4) = 12 mod 26 and f(19) = 17 mod 26. All arithmetic is, as usual, modulo 26. 
We are led then to the equations 


12 = a · 4 + b mod 26 
17 = a · 19 + b mod 26 

We subtract these two equations to eliminate b and obtain 

−5 = a · (−15) mod 26 

or 

a = [−5/(−15)] mod 26 

The solution is a = 9. Substituting this value into the first equation gives b = 
−24 = 2 mod 26. 

Thus our affine encoding transformation is (we hope) f(P) = 9P + 2. It is 
also easy to determine that the inverse (or decoding) transformation is f⁻¹(Q) = 
[Q − 2]/9. 
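
Because the alphabet is small, the two equations of this example can also be solved by simply trying every admissible value of a. The sketch below does exactly that; it is a brute-force stand-in for the algebra above, not a reproduction of it, and the function name candidate_affine_keys is ours. Each pair is (plaintext number, ciphertext number).

```python
from math import gcd

def candidate_affine_keys(pairs, m):
    """Return all (a, b) with gcd(a, m) = 1 and a*p + b = c mod m for every pair (p, c)."""
    keys = []
    p0, c0 = pairs[0]
    for a in range(m):
        if gcd(a, m) != 1:
            continue                      # a must be invertible for decryption to work
        b = (c0 - a * p0) % m             # b is forced by the first pair
        if all((a * p + b) % m == c for p, c in pairs):
            keys.append((a, b))
    return keys

# "E" (4) is assumed to encrypt to "M" (12), and "T" (19) to "R" (17).
print(candidate_affine_keys([(4, 12), (19, 17)], 26))  # [(9, 2)]
```

If the list comes back empty, the guessed correspondences are inconsistent with any affine transformation and a different guess is needed.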














You Try It: Use the affine decryption scheme in the last example to decode the 
message “ZMDEMRILMRRMZ.” 


Next we present an example in which an expanded alphabet is used. 


EXAMPLE 10.6 

Consider the standard roman alphabet of 26 characters along with the addi- 
tional characters “blank space” (denoted ⊔), “question mark” (?), “period” (.), 
and “exclamation point” (!). So now we have 30 characters, and arithmetic will 
be modulo 30. As usual, we assign a nonnegative integer to each of our characters. 
Thus we have 


A ↔ 0 
B ↔ 1 
C ↔ 2 
  ⋮ 
X ↔ 23 
Y ↔ 24 
Z ↔ 25 
⊔ ↔ 26 
? ↔ 27 
. ↔ 28 
! ↔ 29 


Because there are now 30 different characters, we also use 30 different numerical 
codes—the numbers from 0 to 29. 

Imagine that we receive a block of ciphertext, and that we wish to decode it. We 
notice that the most commonly used characters in the ciphertext are “D” and “!”. 
It is known that the most commonly used characters in ordinary English are “⊔” 
and “E”.¹ If we assume that the ciphertext was encrypted with an affine transfor- 
mation, then we seek an affine mapping f(P) = aP + b such that f(⊔) = D and 
f(E) = !. Thus we are led to f(26) = 3 and f(4) = 29 and then to the system of 
equations 


3 = a · 26 + b mod 30 
29 = a · 4 + b mod 30 


As before, we subtract the equations to eliminate b. The result is 


−26 = 22a mod 30 


This equation is equivalent (dividing by 2, which also halves the modulus) to 

−13 = 11a mod 15 


Since 11 and 15 have no factors in common, we may easily find the unique 
solution a = 7 mod 15. Thus a = 7 or a = 22 mod 30, and only a = 7 has no 
prime factors in common with 30. Substituting a = 7 in the second equation gives 
b = 1. We conclude that our affine transformation is f(P) = 7P + 1. 

¹We formerly said that “E” was the most commonly used letter. But that was before we added the blank space 
“⊔” to our alphabet. 

If the ciphertext we have received is 


21 7 29 3 14 29 12 14 3 7 14 19 18 29 24 

then we can apply f⁻¹(Q) = [Q − 1]/7 to obtain the plaintext message 

20 18 4 26 19 4 23 19 26 18 19 24 11 4 29 

This transliterates to 

USE TEXT STYLE! 

A nice feature of this example is that the spaces and the punctuation are built into 
our system of characters. Hence the translated message is quite clear, and requires 
no further massaging. 
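
The same computation works for the 30-character alphabet once the alphabet is stored as a string. In the sketch below the string ALPHABET and the function name are ours, and an ordinary space stands in for the blank-space symbol at position 26. To keep the check self-contained, the ciphertext is produced by encrypting the plaintext numbers of this example with f(P) = 7P + 1 and then decrypted again.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ?.!"  # index 26 is the blank space

def affine_decrypt_text(cipher_numbers, a, b, alphabet=ALPHABET):
    """Apply Q -> [1/a](Q - b) mod len(alphabet) and convert to characters."""
    m = len(alphabet)
    a_inv = pow(a, -1, m)  # modular inverse; needs Python 3.8 or later
    return "".join(alphabet[(a_inv * (q - b)) % m] for q in cipher_numbers)

plain_numbers = [20, 18, 4, 26, 19, 4, 23, 19, 26, 18, 19, 24, 11, 4, 29]
cipher_numbers = [(7 * p + 1) % 30 for p in plain_numbers]

print(cipher_numbers)
print(affine_decrypt_text(cipher_numbers, 7, 1))  # USE TEXT STYLE!
```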














10.6 Digraph Transformations 


Just to give an indication of how cryptographers think, we shall now consider 
digraphs. Instead of thinking of our message units as single characters, we will now 
have units that are pairs of characters. Put in other words, the plaintext message is 
broken up into two-character segments or words. (It should be stressed that these 
will not, in general, correspond to English words. Certainly words from the English 
language are generally longer than two letters. Here, when we say “word,” we 
simply mean a unit of information.) 

In case the plaintext message has an odd number of characters, then of course 
we cannot break it up evenly into units of two characters. In this instance we add 
a “dummy” character like “X” to the end of the message so that an even number 
of characters will result. Any English message will still be readable if an “X” is 
tacked on the end. 

Let K be the number of elements in our alphabet (in earlier examples, we have 
seen alphabets with 26 characters and also alphabets with 30 characters). Suppose 
now that MN is a digraph (that is, an ordered pair of characters from our alphabet). 
Let x be the numerical equivalent of M and let y be the numerical equivalent of N. 
Then we assign to the digraph MN the number x · K + y. Roughly speaking, we 
are now working in base-K arithmetic. 


EXAMPLE 10.7 
Let us work in the familiar roman alphabet of 26 characters. A common digraph in 
English is “TH.” Notice that the numerical equivalent of “T” is 19 and the numerical 
equivalent of “H” is 7. According to our scheme, we assign to this digraph the single 
number 19 · 26 + 7 = 501. 

It is not difficult to see that each integer from 0 through 26 · 26 − 1 = 675 
corresponds to a unique digraph. 
Consider the number 358. Then 26 divides into 358 a total of 13 times with a 
remainder of 20. We conclude that 358 corresponds to the digraph with numerical 
equivalents 13 20. This is the digraph “NU.” 
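
The passage between digraphs and numbers is just a division with remainder, as the following sketch shows. The names ALPHABET, digraph_to_number, and number_to_digraph are ours; Python's built-in divmod returns the quotient and the remainder together.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def digraph_to_number(digraph, alphabet=ALPHABET):
    """Send the digraph MN to x*K + y, where x, y are the letter numbers."""
    k = len(alphabet)
    x, y = alphabet.index(digraph[0]), alphabet.index(digraph[1])
    return x * k + y

def number_to_digraph(n, alphabet=ALPHABET):
    """Recover the digraph: quotient gives the first letter, remainder the second."""
    x, y = divmod(n, len(alphabet))
    return alphabet[x] + alphabet[y]

print(digraph_to_number("TH"))  # 501
print(number_to_digraph(358))   # NU
```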














It is straightforward to see that the greatest integer that can arise in this labeling 
scheme for digraphs is for the digraph QQ, where Q is the last character in our 
alphabet. If the first character is assigned to 0 (as we have done in the past) then 
the last character is assigned to K − 1 (where K is the number of characters in the 
alphabet). The numerical label is then (K − 1) · K + (K − 1) = K · K − 1. So it is 
safe to say that K² − 1 is an upper bound for numerical labels in our digraph system. 

We conclude, then, that an enciphering transformation is a function that consists 
of a rearrangement of the integers {0, 1, 2, ..., K² − 1}. One of the simplest such 
transformations is an affine transformation on {0, 1, 2, ..., K² − 1}. We think of 
this set of integers as Z modulo K². So the encryption has the form f(P) = 
aP + b mod K². As usual, the integer a must have no prime factors in common 
with K² (and hence no prime factors in common with K). 


EXAMPLE 10.8 
We work as usual with the 26-letter roman alphabet. There are then 26 × 26 
digraphs, and these are enumerated by means of the integers 0, 1, 2, ..., 26² − 1. 
In other words, we work in arithmetic modulo 676, where of course 676 = 26². 
The digraph “ME” has letters “M” corresponding to 12 and “E” corresponding to 
4. Thus we assign the digraph number 12 · 26 + 4 = 316 mod 676. 

If our affine enciphering transformation is f(P) = 97 · P + 230 then the digraph 
“ME” is encrypted as 97 · 316 + 230 = 462 mod 676. 

If instead we consider the digraph “EM” then we assign the integer 4 · 26 + 12 = 
116. And now the encryption is 97 · 116 + 230 = 666 mod 676. 
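
The two encryptions in this example can be checked with a few lines of code. This is a sketch under the example's own parameters (a = 97, b = 230, modulus 676); the names are ours.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
K = len(ALPHABET)      # 26 letters
MODULUS = K * K        # 676 digraph labels, 0 through 675

def digraph_to_number(digraph):
    """Send the digraph MN to x*K + y."""
    return ALPHABET.index(digraph[0]) * K + ALPHABET.index(digraph[1])

def encrypt_digraph(digraph, a=97, b=230):
    """Apply f(P) = a*P + b mod K^2 to the digraph's numerical label."""
    return (a * digraph_to_number(digraph) + b) % MODULUS

print(encrypt_digraph("ME"))  # 462
print(encrypt_digraph("EM"))  # 666
```

Notice that swapping the two letters of the digraph changes the result completely, which is part of what makes digraph systems harder to attack than single-letter ones.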














EXAMPLE 10.9 

Suppose that we want to break a digraphic encryption system that uses an affine 
transformation. So we need to determine a and b. This will require two pieces of 
information. 

Let us attempt a frequency analysis. From statistical studies, it is known that 
some of the most common digraphs are “TH,” “HE,” and “EA.” The most common 
ones that include the “blank space” character are “E⊔,” “S⊔,” and “⊔T.” If we 
examine a good-sized block of ciphertext and notice the most commonly occurring 
digraphs, then we might suppose that those are the encryptions of “TH” or “HE” 
or “EA.” Consider for example the ciphertext (based on the 27-character alphabet 
consisting of the usual 26 letters of the roman alphabet plus the blank space, and 
numbered 0 through 26) 


XIHZYIQHRCZJSDXIDCYIQHPS 

We notice that the digraphs “XI,” “YI,” and “QH” each occur twice in the mes- 
sage. We might suppose that one of these is the encryption of “TH,” one is the 
encryption of “HE,” and one is the encryption of “EA” (although, as indicated 
above, there are other possibilities). Let us attempt to directly solve for the affine 
transformation that will decrypt our ciphertext. The affine transformation will have 
the form f⁻¹(Q) = a′Q + b′ and our job is to find a′ and b′. 
To be specific, let us guess that 


TH encrypts as YI 


HE encrypts as XI 
This means that we have the numerical correspondences 

520 ↔ 656 

and 

193 ↔ 629 

So we have the algebraic equations 


520 = a′ · 656 + b′ mod 729 
193 = a′ · 629 + b′ mod 729 


Subtracting the equations as usual (to eliminate b′), we see that 


327 = a′ · 27 mod 729 

Unfortunately this equation cannot be solved by dividing by 27, because 27 and 729 
have the prime factor 3 in common. (In fact it has no solution at all: any multiple 
of 27, reduced mod 729, is still divisible by 27, and 327 is not.) 
We make another guess. Let us suppose that 


TH encrypts as QH 
HE encrypts as YI 
This means that we have the numerical correspondences 

520 ↔ 439 

and 

193 ↔ 656 


So we have the algebraic equations 


520 = a′ · 439 + b′ mod 729 
193 = a′ · 656 + b′ mod 729 


Subtracting the equations as usual (to eliminate b′), we see that 

327 = a′ · (−217) mod 729 


Now 217 and 729 have no prime factors in common, so we may solve for a′ uniquely. 
The answer is a′ = 321. Substituting into our first equation gives b′ = 298. So our 
decryption algorithm is 


f⁻¹(Q) = 321Q + 298 mod 729 (10.2) 

We apply this rule to the ciphertext 

XIHZYIQHRCZJSDXIDCYIQHPS 


For example, the digraph “XI” has numerical equivalent 629. It translates, with 
decryption rule in Eq. (10.2), to 274. This in turn corresponds to the plaintext digraph 
“KE.” The next digraph, “HZ,” has numerical equivalent 214 and translates to 466, 
which is the digraph “RH.” We can already tell we are in trouble, because no sensible 
English message begins with “KERH.” 

It is our job then to try all the other possible correspondences of encrypted 
digraphs “XI,” “YI,” and “QH” to the plaintext digraphs. We shall not work them 
all out here. It turns out that the one that does the trick is 

XI is the encryption of TH 

and 

QH is the encryption of EA 


Let us try it and see that it successfully decrypts our secret message. 
The proposed correspondences have numerical interpretation 


629 ↔ 520 

and 

439 ↔ 108 

This leads to the equations 


520 = a′ · 629 + b′ mod 729 
108 = a′ · 439 + b′ mod 729 


Subtracting as usual, we obtain 

412 = a′ · 190 mod 729 


Since 190 and 729 have no prime factors in common, we can certainly divide by 
190 and solve for a′. We find that a′ = 547. Substituting into the second equation 
gives b′ = 545. In conclusion, the decrypting transformation is f⁻¹(Q) = 547Q + 
545 mod 729. 

Now we can systematically apply this affine transformation to the digraphs in 
the ciphertext and recover the original message. Let us begin: 


XI → 629 → (apply f⁻¹) → 520 → TH 
HZ → 214 → (apply f⁻¹) → 234 → IS 
The calculations continue, and the end result is the original plaintext message 


THIS HEART OR THAT HEADX 

As you can see, an “X” is affixed to the end to force the message to have an 
even number of characters (counting blank spaces) so that the digraph method will 
work. 
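
The guess-and-check procedure of this example lends itself to a short program. The sketch below is ours, not the book's: the function names are hypothetical, an ordinary space stands in for the blank-space character numbered 26, and Python's pow(x, -1, m) (Python 3.8 or later) is used for the modular division. Each guess is a pair (ciphertext digraph, plaintext digraph). The function returns None when the subtracted equation cannot be solved by division.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "  # index 26 is the blank space
K = len(ALPHABET)                          # 27 characters
M = K * K                                  # 729 digraph labels

def to_number(digraph):
    return ALPHABET.index(digraph[0]) * K + ALPHABET.index(digraph[1])

def to_digraph(n):
    x, y = divmod(n, K)
    return ALPHABET[x] + ALPHABET[y]

def solve_decryption_key(guess1, guess2):
    """Solve p = a'*c + b' mod M from two (cipher, plain) digraph guesses."""
    c1, p1 = to_number(guess1[0]), to_number(guess1[1])
    c2, p2 = to_number(guess2[0]), to_number(guess2[1])
    try:
        inverse = pow((c1 - c2) % M, -1, M)   # fails if c1 - c2 shares a factor with M
    except ValueError:
        return None
    a = ((p1 - p2) * inverse) % M
    b = (p1 - a * c1) % M
    return a, b

def decrypt(ciphertext, key):
    a, b = key
    pairs = [ciphertext[i:i + 2] for i in range(0, len(ciphertext), 2)]
    return "".join(to_digraph((a * to_number(d) + b) % M) for d in pairs)

ciphertext = "XIHZYIQHRCZJSDXIDCYIQHPS"
print(solve_decryption_key(("YI", "TH"), ("XI", "HE")))  # None: 27 is not invertible mod 729
key = solve_decryption_key(("XI", "TH"), ("QH", "EA"))
print(key)                       # (547, 545)
print(decrypt(ciphertext, key))  # THIS HEART OR THAT HEADX
```

A program like this can run through all the plausible pairings of common digraphs in a moment; deciding which output is sensible English is still left to the human reader.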














One important point that the last example illustrates is that cryptography will 
always entail a certain amount of (organized) guesswork. 


10.7 RSA Encryption 
10.7.1 BASICS AND BACKGROUND 


Modern security considerations make it desirable for us to have new types of encryp- 
tion schemes. It is no longer enough to render a message so that only the intended 
recipient can read it (and outsiders cannot). In today’s complex world, and with the 
advent of high-speed digital computers, there are new demands on the technology 
of cryptography. The present section will discuss some of these considerations. 

In the old days (beginning even with Julius Caesar), it was enough to have a 
method for disguising the message that we were sending. For example, imagine 
that the alphabet is turned into numeric symbols by way of the scheme 


A ↦ 0 
B ↦ 1 
C ↦ 2 
and so forth. 
Then use an encryption like 
n ↦ n + 3 mod 26 (10.3) 


And now convert these numbers back to roman letters. We have discussed such 
encryption schemes in the preceding sections. 

Today life is more complex. One can imagine that there would be scenarios in 
which 


1. You wish to have a means by which a minimum-wage security guard (whom you 
don’t necessarily trust) can check that people entering a facility know a 
password, but you don’t want him to know the password. 


2. You wish to have a technology that allows anyone to encrypt a message 
using a standard, published methodology, but only someone with special 
additional information can decrypt it. 


3. You wish to have a method to be able to convince someone else that you 
can perform a procedure, or solve a problem, or prove a theorem, without 
actually revealing the details of the process. 


This may all sound rather dreamy, but in fact—thanks to the efforts and ideas 
of R. Rivest, A. Shamir, and L. Adleman—it is now possible. The so-called RSA 
encryption scheme is now widely used. For example, the e-mail messages that I 
receive on my cell phone are encrypted using RSA. Banks, secure industrial sites, 
high-tech government agencies (for example, the National Security Agency), and 
many other parts of our society routinely use RSA to send messages securely. 

In this discussion, we shall describe how RSA encryption works, and we shall 
encrypt a message using the methodology. We shall describe all the mathematics 
behind RSA encryption, and shall prove the results necessary to flesh out the 
theory behind RSA. We shall also describe how to convince someone that you can 
prove the Riemann hypothesis—without revealing any details of the proof. This is 
a fascinating idea—something like convincing your mother that you have cleaned 
your room without letting her have a look at the room. But in fact the idea has 
profound and far-reaching applications. 


10.7.2 PREPARATION FOR RSA 
Background Ideas 


We now sketch the background ideas for RSA. These are all elementary ideas from 
basic mathematics. It is remarkable that these are all that are needed to make this 
profound new idea work. 


Computational Complexity 


Suppose that you have a deck of N playing cards and you toss them in the air. 
Now you want to put them back into their standard order. How many “steps” will 
this take? (We want to answer this question in such a manner that a machine could 
follow the instructions.) 

First we look through all N cards and find the first card in the ordering. Then we
look through the remaining N − 1 cards and find the second card in the ordering.
And so forth.


Figure 10.2 Coloring of a graph. (Left: a typical planar graph. Right: an admissible
coloring for the planar graph.)


So the reordering of the cards takes


$N + (N-1) + (N-2) + \cdots + 3 + 2 + 1 = \frac{N(N+1)}{2}$


steps. Notice that this answer is a quadratic polynomial in N. Thus we say that the
problem can be solved in polynomial time. 
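
The count above can be checked with a short Python sketch. The function name `sort_steps` and the convention of charging one step per card examined are assumptions made here for illustration.

```python
def sort_steps(cards):
    """Sort by repeatedly scanning for the smallest remaining card.

    Returns the ordered list and the total number of cards examined.
    """
    remaining = list(cards)
    ordered = []
    steps = 0
    while remaining:
        steps += len(remaining)        # look through every remaining card
        smallest = min(remaining)
        remaining.remove(smallest)     # set that card aside
        ordered.append(smallest)
    return ordered, steps


deck = [5, 3, 8, 1, 9, 2, 7]
ordered, steps = sort_steps(deck)
N = len(deck)
print(ordered, steps, N * (N + 1) // 2)
```

For a deck of N = 7 cards the scan examines 7 + 6 + ... + 1 = 28 cards, which agrees with N(N + 1)/2.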

We have heard a rumor that the four-color theorem is true. So we have a graph 
with N vertices and we wish to color each vertex, using either red, yellow, blue, or 
green. See Fig. 10.2, where we have used shading to suggest the coloring. The only 
rule is that two adjacent vertices (that is, vertices that are connected by a segment) 
cannot be the same color. 

Of course the number of possible colorings is the number of functions from the
set with N objects to the set with 4 objects. That is $4^N$. The machine, being as dumb
as it is, will simply try all the possible colorings until it finds one that works. Thus
we see that the number of steps is now an exponential function of N.

We call this an exponential time problem. 
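
The "dumb machine" search can be sketched in Python as follows. The graph used here (a five-cycle with a hub joined to every vertex) is a hypothetical example, not the graph of Fig. 10.2, and the function name is an assumption.

```python
from itertools import product


def brute_force_coloring(num_vertices, edges, num_colors=4):
    """Try colorings one at a time until no edge joins two like-colored vertices.

    Returns the first admissible coloring found and the number of colorings tried.
    """
    tried = 0
    for colors in product(range(num_colors), repeat=num_vertices):
        tried += 1
        if all(colors[u] != colors[v] for u, v in edges):
            return colors, tried
    return None, tried


# Hypothetical graph: vertices 0..4 form a cycle, vertex 5 is joined to all of them.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
         (5, 0), (5, 1), (5, 2), (5, 3), (5, 4)]
coloring, tried = brute_force_coloring(6, edges)
print(coloring, tried, 4 ** 6)
```

In the worst case the loop runs through all $4^N$ candidate colorings, which is the exponential growth described above.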

Another interesting exponential time problem is that of scheduling planes for 
an airline. If you have n cities and k planes and you take into account different 
populations, different demands, crew availability, fuel availability, and other 
factors, then it is easy to convince yourself that this is a problem of exponential 
complexity. The theory of linear programming can be used to reduce many 
problems of this kind to polynomial complexity. Linear programming is routinely 
used by the airlines for this purpose. 
Certainly one of the most famous exponential time problems is the “traveling
salesman problem.” The issue here is that a certain salesman wants to visit each of 
n cities precisely once. What is his most efficient path? It is not difficult to discern 
that there are exponentially many possible paths, and no evident strategy for picking 
one in any efficient manner. 


Modular Arithmetic 


This is a familiar idea, and we have already alluded to it earlier. The “right” way 
to define the idea is with cosets, but we shall content ourselves here with a more 
informal definition. 

When we write n mod k we mean simply the remainder when n is divided by k.
Thus


$25 = 1 \bmod 3$
$15 = 3 \bmod 4$
$-13 = -3 \bmod 5 = 2 \bmod 5$


It is an important fact (which again is most clearly seen using the theory of
cosets) that modular arithmetic respects sums and products. That is,


$(a + b) \bmod n = (a \bmod n) + (b \bmod n)$ and $(a \cdot b) \bmod n = (a \bmod n) \cdot (b \bmod n)$,

where each right-hand side is itself understood to be reduced mod n.


We shall use these facts in a decisive manner below. 
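
A quick numerical check of these rules, as a Python sketch. The helper name `check_mod_rules` is an assumption; Python's `%` operator returns a remainder between 0 and n − 1 when n is positive, which matches the convention used in the examples above.

```python
def check_mod_rules(a, b, n):
    """Check that reduction mod n respects sums and products."""
    sum_ok = (a + b) % n == ((a % n) + (b % n)) % n
    product_ok = (a * b) % n == ((a % n) * (b % n)) % n
    return sum_ok and product_ok


examples = (25 % 3, 15 % 4, -13 % 5)
all_ok = all(check_mod_rules(a, b, n)
             for a in range(-20, 21)
             for b in range(-20, 21)
             for n in range(1, 12))
print(examples, all_ok)
```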


Fermat’s Theorem 


Let a and b be two (positive) integers. We say that a and b are relatively prime if 
they have no common prime factors. For example, 


$72 = 2^3 \cdot 3^2$
$175 = 5^2 \cdot 7$


hence 72 and 175 are relatively prime.
If n is an integer, let P(n) be the set of positive integers less than n that are relatively
prime to it. Let $\varphi(n)$ be the number of elements in P(n).


Theorem 10.1 If n is a positive integer and k is relatively prime to n then


$k^{\varphi(n)} = 1 \bmod n$
Proof: The proof of this result is easy. For the collection P(n) of numbers
relatively prime to n forms a group under multiplication modulo n. That is, if a is
relatively prime to n and b is relatively prime to n then logic dictates that $a \cdot b$ is
relatively prime to n. Now it is a fundamental fact (we cannot prove it here, but see
[BMS]) that if a group has m elements and g is an element of the group then $g^m$
is the group identity. Thus any element of the group, raised to the power $\varphi(n)$ (the
number of elements in the group), will equal 1 modulo n.
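
The theorem is easy to test numerically. The following Python sketch computes P(n) by brute force; the function names `units` and `phi` are assumptions chosen for this illustration.

```python
from math import gcd


def units(n):
    """The set P(n): positive integers below n that are relatively prime to n."""
    return [k for k in range(1, n) if gcd(k, n) == 1]


def phi(n):
    """The number of elements of P(n)."""
    return len(units(n))


n = 72
theorem_holds = all(pow(k, phi(n), n) == 1 for k in units(n))
print(phi(n), theorem_holds)
```

Here `pow(k, phi(n), n)` is Python's built-in modular exponentiation, so the check never forms the huge number $k^{\varphi(n)}$ itself.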














For later use, it is worth noting that if p, q are distinct prime numbers and $n = p \cdot q$ then


$\varphi(n) = (p-1) \cdot (q-1)$.

The reason is that the only numbers less than or equal to n that are not relatively
prime to n are $p, 2p, 3p, \ldots, q \cdot p$ and $q, 2q, 3q, \ldots, (p-1)q$.

There are q numbers in the first list and p − 1 numbers in the second list. The
set P(n) of numbers relatively prime to n is the complement of these two lists, and
it therefore has


$pq - q - (p-1) = pq - q - p + 1 = (p-1) \cdot (q-1) = \varphi(n)$


elements. 
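
The counting argument can be checked by brute force. In this Python sketch the primes 5 and 7 and the helper names are assumptions chosen only for illustration.

```python
from math import gcd


def phi(n):
    """Count the integers from 1 to n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def excluded_lists(p, q):
    """The two lists of numbers up to p*q that share a factor with p*q."""
    multiples_of_p = [p * i for i in range(1, q + 1)]   # p, 2p, ..., q*p
    multiples_of_q = [q * i for i in range(1, p)]       # q, 2q, ..., (p-1)q
    return multiples_of_p, multiples_of_q


p, q = 5, 7
first, second = excluded_lists(p, q)
print(len(first), len(second), phi(p * q), (p - 1) * (q - 1))
```

For p = 5 and q = 7 the lists have 7 and 4 entries, and 35 − 7 − 4 = 24 = (5 − 1)(7 − 1).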


Relatively Prime Integers 


Two integers a and b are relatively prime if they have no prime factors in common. 
As noted above, for example, 72 and 175 are relatively prime. 

It is a fundamental fact of elementary number theory that if a, b are relatively
prime then we can find other integers x and y such that


$xa + yb = 1$ (10.4)


For example, we have noted that a = 72 and b = 175 are relatively prime. The
corresponding integers x, y are x = −17 and y = 7. Thus


$(-17) \cdot 72 + 7 \cdot 175 = 1$


One can prove this result using Fermat’s theorem above. For, since b is relatively
prime to a, we have


$b^{\varphi(a)} = 1 \bmod a$

But this just says that
$b^{\varphi(a)} - 1 = k \cdot a$


for some integer k. Unraveling this gives Eq. (10.4), with $x = -k$ and $y = b^{\varphi(a)-1}$.
In practice, one finds x and y using the Euclidean algorithm (otherwise known 
as long division). 
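In the example of 72, 175, one calculates:
$175 = 2 \cdot 72 + 31$
$72 = 2 \cdot 31 + 10$
$31 = 3 \cdot 10 + 1$


You know you are finished when the remainder is 1.
For now we have


$1 = 31 - 3 \cdot 10$
$= 31 - 3 \cdot (72 - 2 \cdot 31)$
$= 7 \cdot 31 - 3 \cdot 72$
$= 7 \cdot (175 - 2 \cdot 72) - 3 \cdot 72$
$= 7 \cdot 175 - 17 \cdot 72$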


That is the decomposition we seek. 
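
The back-substitution can be automated. The following Python sketch is one standard way to organize the Euclidean algorithm so that x and y are carried along; the function name `extended_gcd` is an assumption.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with x*a + y*b == g, where g is the gcd of a and b."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        # Each remainder is kept as a combination of a and b.
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


g, x, y = extended_gcd(72, 175)
print(g, x, y)   # 1 -17 7
```

The output reproduces the decomposition found by hand: (−17) · 72 + 7 · 175 = 1.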


10.7.3 THE RSA SYSTEM ENUNCIATED 
Description of RSA 


Now we can quickly and efficiently describe how to implement the RSA encryption 
system, and we can explain how it works. 

Imagine that George W. Bush has an important message that he wishes to send 
to Donald Rumsfeld. Of course Rumsfeld is a highly placed man of many respon- 
sibilities, and you can imagine that Bush’s message is quite secret. So he wants to 
encode the message: 


Your time is up. Hasta la vista, baby. 


So Bush goes to the library and finds the RSA encryption book. This is a readily 
available book that anyone can access. It is not secret. A typical page in the book 
reads like this: 
Name                       Value of n      Value of e

Puck, Wolfgang             4431...7765     8894...4453
Rehnquist, William         6668...2345     1234...9876
Riddle, Nelson             7586...2390     4637...4389
Rin Tin-Tin                5355...5353     5465...7648
Rogers, Roy                7859...4359     3058...2934
Roosevelt, Theodore        7835...2523     7893...4232
Rotten, Johnny             3955...4343     4488...9922
Roy, Rob                   3796...5441     2219...3319
Rumsfeld, Donald           1117...8854     9266...2388
Russert, Tim               6464...4646     3223...3232
Schwarzenegger, Arnold     6894...3242     7525...2314
Simpson, Orenthal James    6678...2234     4856...2223





What does this information mean? Of course we know, thanks to Euclid, that
there are infinitely many primes. So we can find prime numbers with as many digits
as we wish. Each number n in the RSA encryption book is the product of two
75-digit primes p and q. Thus $n = p \cdot q$. Each number e is chosen to be a number with
at least 100 digits that is relatively prime to $\varphi(n) = (p-1) \cdot (q-1)$. Of course
we do not publish the prime factorization of the number n; we also do not publish
$\varphi(n)$. All that we publish is n and e for each individual.

Now an important point to understand is that Bush does not need to understand 
any mathematics or any of the theory of RSA encryption in order to encode his 
message. (Well, it would be nice if he understood modular arithmetic. But he is, 
after all, the President of the United States.) All he does is this: 


1. First he breaks the message into units of five letters. We call these “words”, 
even though they may not be English language words. 


For the message from Bush to Rumsfeld, the “words” would be


YOURT IMEIS UPHAS TALAV ISTAB ABY 


2. He transliterates each “word” into a sequence of numerical digits, using our 
usual scheme of translation. 


3. Then he encodes each transliterated word w with the rule


$w \mapsto w^e \bmod n$

Bush will send to Rumsfeld this sequence of encrypted words. That is all there
is to it.
The real question now is: 


What does it take to decrypt the encoded message? 
How can Rumsfeld read the message? 


This is where some mathematics comes into the picture. We must use Fermat’s 
theorem, and we must use our ideas about relatively prime integers. But the short 
answer to the question is this. If $\tilde{w}$ is a word encrypted according to the simple
scheme described above, then we decrypt it with this algorithm:


We find integers x and y so that $xe + y\varphi(n) = 1$ and then we calculate
$\tilde{w}^x \bmod n$


That will give the decrypted word w with which we began. [We shall provide 
the mathematical details of this assertion in the next section.] Since w has only five 
characters, and n has 150 characters, we know that w mod n = w—so there is no 
ambiguity arising from modular arithmetic. We can translate w back into roman 
characters, and we recover our message. 
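
The whole round trip can be tried on a toy scale. The following Python sketch is an illustration under stated assumptions, not a secure implementation: the two Mersenne primes stand in for the 75-digit primes, e = 65537 is a choice made here, the two-digits-per-letter transliteration (A = 00, ..., Z = 25) is our reading of "the usual scheme," and the decryption exponent is taken as the positive representative of x modulo $\varphi(n)$ via `pow(e, -1, phi)` (Python 3.8 or later).

```python
from math import gcd


def word_to_int(word):
    """Transliterate A -> 00, B -> 01, ..., Z -> 25 and read the digits as one integer."""
    return int("".join(f"{ord(ch) - ord('A'):02d}" for ch in word))


def int_to_word(value, length):
    """Invert word_to_int for a word of known length."""
    digits = str(value).zfill(2 * length)
    return "".join(chr(int(digits[i:i + 2]) + ord("A"))
                   for i in range(0, 2 * length, 2))


p = 2**31 - 1              # assumed toy primes (both are Mersenne primes)
q = 2**61 - 1
n = p * q                  # published
phi = (p - 1) * (q - 1)    # kept secret
e = 65537                  # published; must be relatively prime to phi
assert gcd(e, phi) == 1
x = pow(e, -1, phi)        # x*e + y*phi == 1 for some integer y

words = ["YOURT", "IMEIS", "UPHAS", "TALAV", "ISTAB", "ABY"]
encrypted = [pow(word_to_int(w), e, n) for w in words]
decrypted = [int_to_word(pow(c, x, n), len(w)) for c, w in zip(encrypted, words)]
print(decrypted)
```

Encryption uses only the published n and e; decryption needs x, which was computed from $\varphi(n)$ and hence from the factorization of n.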

Now here is the most important point in our development thus far: 


In order to encrypt a message, we need only look up n and e in the public
record RSA encryption book. But, in order to decrypt the message, we
must know x. Calculating x necessitates knowing $\varphi(n)$, and that necessitates
knowing the prime factorization of n.


No efficient method is known for calculating the prime factorization of an integer
with k digits: it is widely believed, though it has not been proved, that the problem
cannot be solved in time polynomial in k. For an integer with 150 digits, using a
reasonably fast computer, it would take several years to find the prime factorization.²


10.7.4 THE RSA ENCRYPTION SYSTEM EXPLICATED 
Explanation of RSA 


In fact, with all the preliminary setup that we have in place, it is a simple matter to 
explain the RSA encryption system. 





² One might note that, if your message is just five letters long, then there are only $26^5$ possible encryptions.
You could decrypt the message by trial and error. Because of considerations like this, we often find it convenient
to append a 50-digit random integer to the message. This technique is discussed in detail near the end of this
discussion.
For suppose that we selected an $n = p \cdot q$ and an e relatively prime to $\varphi(n) = (p-1) \cdot (q-1)$
corresponding to a particular person listed in the RSA encryption
book. If we are the certified decrypter, then we know the prime factorization of
n; that is, we know that $n = p \cdot q$ for p and q prime.

We therefore know $\varphi(n) = p \cdot q - p - q + 1 = (p-1) \cdot (q-1)$ and so we
can calculate the x and the y in the identity $xe + y\varphi(n) = 1$. Once we know x, then
we know everything. For


$\tilde{w}^x \bmod n = [w^e]^x \bmod n$
$= w^{xe} \bmod n$
$= w^{1 - y\varphi(n)} \bmod n$
$= w \cdot [w^{\varphi(n)}]^{-y} \bmod n$
$= w \cdot 1^{-y} \bmod n$
$= w \bmod n$
$= w$


since w is certainly relatively prime to n.
This shows how we recover the original word w from the encrypted word $\tilde{w} = w^e \bmod n$.


10.7.5 ZERO KNOWLEDGE PROOFS 
How to Keep a Secret 


We shall now give a quick and dirty description of how to convince someone that 
you can prove Proposition A without revealing any details of the proof of 
Proposition A. The idea comes across most clearly if we deal again with colorings 
of graphs. 

So suppose once again that we are given a graph. See Fig. 10.3. We are the 
prover, and there is a remotely located verifier. 

It is our job to convince the verifier that we know how to four-color this graph. 
But we do not want the verifier to actually know how to color the graph. We only 
want him to be convinced that we know how to do it. 

We begin by transmitting the adjacency matrix to the verifier. This data is exhib- 
ited in Table 10.1. 

This transmission is straightforward, and need not be encoded. We simply tell
the verifier: “In position (1, 1) of the matrix there is an x;” “In position (1, 2) of the
matrix there is an x;” “In position (1, 5) of the matrix there is no x.” And so forth.

Figure 10.3 Zero knowledge coloring of a graph.


Now we number the colors 1, 2, 3, and 4. Here 


$1 \leftrightarrow$ blue
$2 \leftrightarrow$ red
$3 \leftrightarrow$ yellow
$4 \leftrightarrow$ green


Finally, imagine that our coloring of the graph is as in Fig. 10.4. Of course we 
have used shading to suggest the coloring. We wish to communicate to the verifier 
that we have a valid coloring—in such a manner that he can check it, but he cannot 
learn any of the details of the coloring. 

As we see from Figure 10.4, the coloring is encoded as 


12 23 34 41 51 64 73 








Table 10.1 

123 4 5 6 7 
1}]}x x x x x 
2)]x x x 
3] x x x 
4]x x x 
5 x x x x 
6 x x x 
7|x x X x 
Figure 10.4 Coding of the coloring.


This is read as “node 1 is colored with color 2 (red),” “node 2 is colored with color 
3 (yellow),” and so on. We will transmit these pairs of digits, suitably encoded, to 
the verifier. 

The trouble is that the verifier already knows that there are only seven nodes in 
the graph, and only four colors, so he could (with a little effort) figure out what 
color has been assigned to what node just with a little trial and error—even though 
the information has been encoded. So this will not do. 

Thus, instead of encoding and sending 12 and 23 and 34 and so forth,
the prover encodes and sends


12 $r_1$
23 $r_2$
34 $r_3$


and so on, where $r_1$, $r_2$, $r_3$, ... are 50-digit random integers.
More precisely, the step-by-step scenario is this: 


1. The prover sends the entire coloring to the verifier, in encoded form as indi- 
cated above. 

2. The verifier stares at his adjacency matrix. He notices that, for example, 
vertices 4 and 6 are adjacent. And he inquires specifically about those two 
vertices. 
3. The prover sends the verifier the colorings for those two particular vertices
(with a 50-digit random integer attached to each, just as before).


4. The verifier encrypts the information for those two vertices—using the 
pre-agreed-upon public key encryption system. He then checks that those 
two vertex colorings match colorings in the full coloring of the graph that 
was sent in Step 1. 


Now the verifier has checked that one pair of adjacent vertices is suitably 
colored (that is, with different colors). If he wants to perform further verifications, 
then the preceding steps are repeated. Except that first the prover assigns numbers 
to the four colors red, yellow, blue, and green in some new random fashion. And 
he chooses an entirely new set of random 50-digit integers. Then he sends the 
entire colored graph, gets a query from the verifier, and so forth. 

If there are n nodes to be colored then there are n(n − 1)/2 possible pairs of
nodes. Suppose that the prover lied about the coloring, so that at least one pair of
adjacent nodes shares a color. If the verifier chooses the pair to query at random,
then he catches the lie with probability at least 2/[n(n − 1)], and so fails to catch it
with probability at most 1 − 2/[n(n − 1)]. If the entire process is iterated k times, with
the queries chosen independently, then the probability that the verifier never catches
the lie is at most $(1 - 2/[n(n-1)])^k$, which tends to 0 as k grows. Thus each successive
verification increases the likelihood that the verifier may be certain of his
check.

The point of this procedure is that the verifier can check that any pair of adjacent 
vertices is colored correctly, that no two adjacent vertices are colored the same, 
but he cannot amalgamate the information and produce the entire coloring of the 
graph. 
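
The protocol can be simulated in a few lines of Python. This is a sketch under loud assumptions: a SHA-256 hash of (node, color, random integer) is swapped in for the pre-agreed public key encryption of the text, the edge list contains only the two adjacencies the discussion mentions (vertices 1, 2 and vertices 4, 6), and all function names are hypothetical.

```python
import hashlib
import random
import secrets


def commit(node, color, nonce):
    """Stand-in for encrypting a (node, color) pair with a random integer attached."""
    return hashlib.sha256(f"{node}:{color}:{nonce}".encode()).hexdigest()


def prover_commit(coloring, rng):
    """Renumber the four colors at random, draw fresh random integers, commit to all nodes."""
    labels = [1, 2, 3, 4]
    rng.shuffle(labels)
    relabeled = {node: labels[color - 1] for node, color in coloring.items()}
    nonces = {node: secrets.randbelow(10**50) for node in coloring}
    commitments = {node: commit(node, relabeled[node], nonces[node])
                   for node in coloring}
    return relabeled, nonces, commitments


def verifier_accepts(edge, relabeled, nonces, commitments):
    """Re-encode the two opened nodes, compare with Step 1, then compare colors."""
    u, v = edge
    opened_ok = all(commit(w, relabeled[w], nonces[w]) == commitments[w]
                    for w in (u, v))
    return opened_ok and relabeled[u] != relabeled[v]


def run_rounds(coloring, edges, rounds, seed=0):
    """Repeat the commit / query / open / check cycle; stop at the first failure."""
    rng = random.Random(seed)
    for _ in range(rounds):
        relabeled, nonces, commitments = prover_commit(coloring, rng)
        edge = rng.choice(edges)               # the verifier's query
        if not verifier_accepts(edge, relabeled, nonces, commitments):
            return False
    return True


coloring = {1: 2, 2: 3, 3: 4, 4: 1, 5: 1, 6: 4, 7: 3}   # the pairs 12 23 34 41 51 64 73
edges = [(1, 2), (4, 6)]                                # only the adjacencies named in the text
honest_passes = run_rounds(coloring, edges, rounds=20)

cheating = dict(coloring)
cheating[6] = cheating[4]                               # vertices 4 and 6 now share a color
cheat_caught = not run_rounds(cheating, edges, rounds=200)
print(honest_passes, cheat_caught)
```

Because the colors are renumbered and the random integers redrawn in every round, the pairs opened in different rounds cannot be pieced together into a full coloring.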

Of course the example we have presented is for graph coloring, just because that 
is simple to describe. But any proof whatsoever can be translated into binary code 
and then rendered as a statement about the coloring of some graph. So in fact the 
example we have given is perfectly general. 


Concluding Remarks 


The RSA encryption scheme is one of the great ideas of modern coding theory. It 
is being developed and enhanced even as we speak. There are versions for multiple 
verifiers, for dishonest provers, and many other variants. The history of RSA is 
a remarkable one. There was a talk at the 1986 International Congress of Math- 
ematicians in Berkeley about the method. After that, the government attempted 
to co-opt the method, retract all the preprints, and suppress the information. In- 
terestingly, it was the National Security Agency (the branch of the government 
in Washington that specializes in cryptography) that stepped in and prevented the 
government intervention. Now RSA is in the public domain, and anyone can use it.
It is a powerful tool. 


Exercises 


1. Use the shift encryption system given by $P \mapsto P - 3$ to encrypt the message
BYE BYE, BIRDIE 


2. Use the shift decryption scheme $P \mapsto P - 12$ to decrypt the code
EAXAZSNMNK. 


3. Use a frequency analysis on the ciphertext ZRRGZRURER to determine the 
shift encryption scheme. Then decrypt the message. 


4. Use the affine encryption system given by $P \mapsto 3P + 11$ to encrypt the
message 


HELLO MY HONEY 


5. Use the affine decryption scheme $P \mapsto [P - 3]/7$ to decrypt the code
RDQYPHZYDQYP. 


6. Break the message 
THIS WAS NOT THE END 
up into two-character digraphs. Now translate each digraph into a pair of
numbers, and then encrypt each digraph according to the rule $P \mapsto 3P + 7$.
Now translate back to a new encrypted word expressed with roman characters.


7. Consider the message 
NOW IS THE TIME 


Transliterate this to a list of numerals, one character at a time, in the usual 
way. Now apply the encryption algorithm 


P+ 5P* + P mod 26 


What ciphertext results? 
Boolean Algebra


11.1 Description of Boolean Algebra 
11.1.1 A SYSTEM OF ENCODING INFORMATION 


Boolean algebra, named after George Boole (1815–1864), is a formal system for
encoding certain relationships that occur in many different logical systems. As an
example, consider the algebra of sets, equipped with the operations $\cap$, $\cup$, and ${}^c(\ )$.
These are “intersection,” “union,” and “complementation.” Propositional logic has
three analogous operations: $\wedge$, $\vee$, and $\sim$. If we think of these as corresponding,
then we find that the two logical systems have very similar formulas:


(i) $[{}^c(S \cup T) = {}^cS \cap {}^cT] \leftrightarrow [\sim(A \vee B) = \sim A \wedge \sim B]$
(ii) $[{}^c(S \cap T) = {}^cS \cup {}^cT] \leftrightarrow [\sim(A \wedge B) = \sim A \vee \sim B]$
(iii) $[S \cup (T \cap U) = (S \cup T) \cap (S \cup U)] \leftrightarrow [A \vee (B \wedge C) = (A \vee B) \wedge (A \vee C)]$
(iv) $[S \cap (T \cup U) = (S \cap T) \cup (S \cap U)] \leftrightarrow [A \wedge (B \vee C) = (A \wedge B) \vee (A \wedge C)]$


Other logical systems, such as the theory of gates in computer logic, or the theory 
of digital circuits in the basic theory of electricity, satisfy analogous properties. 
Boolean algebra abstracts and unifies all of these ideas into a single algebraic system. 
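
The parallel between the two systems can be checked mechanically. In this Python sketch the universe and the sets S, T, U are arbitrary choices made for illustration, and the helper names are assumptions.

```python
from itertools import product


def de_morgan_holds():
    """Check both De Morgan laws over every truth assignment to A and B."""
    for A, B in product([False, True], repeat=2):
        if (not (A or B)) != ((not A) and (not B)):
            return False
        if (not (A and B)) != ((not A) or (not B)):
            return False
    return True


universe = set(range(10))
S, T, U = {1, 2, 3}, {3, 4, 5}, {5, 6, 1}


def complement(X):
    """Set-theoretic complement inside the chosen universe."""
    return universe - X


set_checks = [
    complement(S | T) == complement(S) & complement(T),   # formula (i)
    complement(S & T) == complement(S) | complement(T),   # formula (ii)
    S | (T & U) == (S | T) & (S | U),                     # formula (iii)
    S & (T | U) == (S & T) | (S & U),                     # formula (iv)
]
print(de_morgan_holds(), set_checks)
```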

Stone’s representation theorem [JOH] states that every boolean algebra can
be realized as the algebra of closed-and-open sets on a compact, zero-dimensional 
topological space. This unifies several different areas of mathematics and has proved 
to be a powerful point of view. 


11.2 Axioms of Boolean Algebra 
11.2.1 BOOLEAN ALGEBRA PRIMITIVES 


Boolean algebra contains these primitive elements: a collection, or set, S of objects. 
At a minimum, S will contain the particular elements 0 and 1. Boolean algebra also 
contains three operations (two binary and one unary): +, ×, and the overbar $\bar{\ }$. Boolean algebra
uses the equal sign = and parentheses (, ) in the customary manner. In boolean
algebra, just as in fuzzy set theory, we think of the overbar as denoting set
complementation (although it could have other specific meanings in particular contexts).


11.2.2 AXIOMATIC THEORY OF BOOLEAN ALGEBRA 


The axioms for boolean algebra, using elements $a, b, c \in S$, are these:


1. $a \times 1 = a$
2. $a + 0 = a$
3. $a \times b = b \times a$
4. $a + b = b + a$
5. $a \times (b + c) = (a \times b) + (a \times c)$
6. $a + (b \times c) = (a + b) \times (a + c)$
7.axa=l 

8. $a \times \bar{a} = 0$

9. $a + \bar{a} = 1$


Some of these axioms (1, 2, 3, 4, 5) have a familiar form that we have seen
in ordinary arithmetic. The other four (6, 7, 8, 9) are not familiar, and may be
counterintuitive. In fact Axioms (6, 7, 8, 9) are false in ordinary arithmetic. So
Peano’s arithmetic is not a model for boolean algebra. But the algebra of sets is a
model for boolean algebra. If we let S be the collection of all sets of real numbers,
and if we interpret + as union ($\cup$), × as intersection ($\cap$), the overbar as set-theoretic
complementation (${}^c(\ )$), 0 as the empty set ($\emptyset$), and 1 as the universal set (in this
model, the real numbers $\mathbb{R}$), then in fact all nine axioms now hold. As an example,


$S \cap {}^cS = \emptyset$
is the correct interpretation of Axiom (8), and it is now true. Also
$S \cup (T \cap U) = (S \cup T) \cap (S \cup U)$


is the correct interpretation of Axiom (6), and it is now true. Similar statements 
may be made about the interpretation of the axioms of boolean algebra in the 
propositional calculus. 


11.2.3 BOOLEAN ALGEBRA INTERPRETATIONS 


Conversely, we can also give the boolean algebra interpretations of the four sets of
equivalent statements that we gave in Sec. 11.1.1. These are


(i) $\overline{(a + b)} = \bar{a} \times \bar{b}$
(ii) $\overline{(a \times b)} = \bar{a} + \bar{b}$
(iii) $a + (b \times c) = (a + b) \times (a + c)$
(iv) $a \times (b + c) = (a \times b) + (a \times c)$


11.3 Theorems in Boolean Algebra
11.3.1 PROPERTIES OF BOOLEAN ALGEBRA 


One of the remarkable features of boolean algebra is that it has a very small set of 
axioms, yet many additional desirable properties are readily derived. Some of these 
properties are: 








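1. $a \times a = a$
2. $a + a = a$
3. $a \times 0 = 0$
4. $a + 1 = 1$
5. $0 \neq 1$
6. 0 is unique and 1 is unique
7. $\bar{\bar{a}} = a$
8. $a + (b + c) = (a + b) + c$
9. $a \times (b \times c) = (a \times b) \times c$
10. $\overline{(a \times b)} = \bar{a} + \bar{b}$
11. $\overline{(a + b)} = \bar{a} \times \bar{b}$
12. $\bar{0} = 1$
13. $\bar{1} = 0$
14. $a \times (a + b) = a$
15. $a + (a \times b) = a$
16. $\bar{a} \times (a \times b) = 0$
17. $\bar{a} + (a + b) = 1$
18. $a \times (\bar{a} \times b) = 0$
19. $a + (\bar{a} + b) = 1$
20. If $a \times c = b \times c$ and $a + c = b + c$, then $a = b$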
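
Several of these properties can be sanity-checked in the smallest model, the two-element algebra {0, 1}. The Python sketch below is a check in that one model, not a proof from the axioms; the function names and the choice to represent +, ×, and the overbar by `|`, `&`, and `1 - a` are assumptions.

```python
from itertools import product


def plus(a, b):
    """Boolean + on {0, 1}."""
    return a | b


def times(a, b):
    """Boolean x on {0, 1}."""
    return a & b


def bar(a):
    """Boolean complement on {0, 1}."""
    return 1 - a


IDENTITIES = {
    "a x a = a": lambda a, b, c: times(a, a) == a,
    "a + a = a": lambda a, b, c: plus(a, a) == a,
    "a x 0 = 0": lambda a, b, c: times(a, 0) == 0,
    "a + 1 = 1": lambda a, b, c: plus(a, 1) == 1,
    "double complement": lambda a, b, c: bar(bar(a)) == a,
    "associativity of +": lambda a, b, c: plus(a, plus(b, c)) == plus(plus(a, b), c),
    "associativity of x": lambda a, b, c: times(a, times(b, c)) == times(times(a, b), c),
    "De Morgan for x": lambda a, b, c: bar(times(a, b)) == plus(bar(a), bar(b)),
    "De Morgan for +": lambda a, b, c: bar(plus(a, b)) == times(bar(a), bar(b)),
    "a x (a + b) = a": lambda a, b, c: times(a, plus(a, b)) == a,
    "a + (a x b) = a": lambda a, b, c: plus(a, times(a, b)) == a,
}


def failures():
    """Names of identities that fail for some assignment of a, b, c in {0, 1}."""
    return [name for name, law in IDENTITIES.items()
            if not all(law(a, b, c) for a, b, c in product((0, 1), repeat=3))]


def cancellation_holds():
    """If a x c = b x c and a + c = b + c, then a = b."""
    return all(a == b
               for a, b, c in product((0, 1), repeat=3)
               if times(a, c) == times(b, c) and plus(a, c) == plus(b, c))


print(failures(), cancellation_holds())
```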


11.3.2 A SAMPLE PROOF 


As an illustration of the ideas, we now provide a proof of formula 2. The proof con- 
sists of a sequence of statements, each justified by one of the nine axioms of boolean 
algebra or by the rules of logic. At each step, we cite the relevant axiom or rule. 


1. [Axiom (2)] $a + 0 = a$
2. [Axiom (4)] $a = a + 0$
3. [Axiom (8)] $a \times \bar{a} = 0$
4. [Axiom (3)] $0 = a \times \bar{a}$
5. [Steps 2 and 4] $a = a + (a \times \bar{a})$
6. [Axiom (6)] $a = (a + a) \times (a + \bar{a})$
7. [Axiom (9)] $a = (a + a) \times 1$
8. [Axiom (1) + Step 4 + Step 6] $a = a + a$
9. [Symmetry of equality] $a + a = a$


Some of the elementary theorems of boolean algebra that we have cited require 
rather elaborate justifications. For instance, it is rather difficult to prove the 
associativity of addition. We refer the reader to [NIS] for the details. 


11.4 Illustration of the Use of Boolean Logic 


We now present an example adapted from one that is presented in [NIS, p. 41]. 
Imagine an alarm system for the monitoring of hospital patients. There are four 
factors (inputs/outputs) that contribute to the triggering of the alarm: 




















Inputs/Outputs | Meaning
a              | Patient’s temperature is in the range 36–40°C
b              | Patient’s systolic blood pressure is outside the range 80–160 mm
c              | Patient’s pulse rate is outside the range 60–120 beats per minute
o              | Raising the alarm is necessary








Good sense dictates that we would want the alarm to sound if any of the following 
situations obtains: 


• The patient’s temperature is outside the acceptable range, the blood pressure
is outside the acceptable range, and the pulse rate is outside the acceptable
range.

• The patient’s temperature is in the acceptable range, but the pulse rate is
outside the acceptable range.

• The patient’s temperature is in the acceptable range, but the systolic pressure
is outside the acceptable range.

• The patient’s temperature is in the acceptable range, but both the systolic
pressure and the pulse rate are outside the acceptable range.
11.4.1 BOOLEAN ALGEBRA ANALYSIS


Using boolean algebra, these four conditions can be encoded as 


• $\bar{a} \times b \times c$
• $a \times \bar{b} \times c$
• $a \times b \times \bar{c}$
• $a \times b \times c$


Notice that, since we know that multiplication is associative, we have omitted using 
parentheses to group these binary operations. No ambiguity results. 

The aggregate of all the situations in which we want the alarm to sound can be 
represented by the equation 


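$o = [\bar{a} \times b \times c] + [a \times \bar{b} \times c] + [a \times b \times \bar{c}] + [a \times b \times c]$ (11.1)
Now we can use the laws of boolean algebra to simplify the right-hand side:


$[\bar{a} \times b \times c] + [a \times \bar{b} \times c] + [a \times b \times \bar{c}] + [a \times b \times c]$
$= [\bar{a} \times b \times c] + [a \times \bar{b} \times c] + [a \times b \times \bar{c}] + ([a \times b \times c] + [a \times b \times c] + [a \times b \times c])$
$= ([\bar{a} \times b \times c] + [a \times b \times c]) + ([a \times \bar{b} \times c] + [a \times b \times c]) + ([a \times b \times \bar{c}] + [a \times b \times c])$
$= ([\bar{a} + a] \times [b \times c]) + ([\bar{b} + b] \times [a \times c]) + ([\bar{c} + c] \times [a \times b])$
$= (1 \times [b \times c]) + (1 \times [a \times c]) + (1 \times [a \times b])$
$= [b \times c] + [a \times c] + [a \times b]$

In summary, we have reduced our protocol for the sounding of the medical
alarm to:


$o = [b \times c] + [a \times c] + [a \times b]$ (11.2)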


The circuit that we put in place can be designed after Eq. (11.2) rather than after 
the much more complicated Eq. (11.1). 
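
That the two equations describe the same circuit can be confirmed by a truth table. In this Python sketch the placement of the complements in the long form is read off from the four verbal conditions (temperature outside its range in the first, blood pressure within range in the second, pulse within range in the third), and the function names are assumptions.

```python
from itertools import product


def alarm_full(a, b, c):
    """The four-term form of Eq. (11.1)."""
    return ((not a) and b and c) or (a and (not b) and c) \
        or (a and b and (not c)) or (a and b and c)


def alarm_simplified(a, b, c):
    """The three-term form of Eq. (11.2)."""
    return (b and c) or (a and c) or (a and b)


rows = [(a, b, c, alarm_full(a, b, c), alarm_simplified(a, b, c))
        for a, b, c in product([False, True], repeat=3)]
forms_agree = all(full == simple for _, _, _, full, simple in rows)
print(forms_agree)
```

Both forms sound the alarm exactly when at least two of the three inputs are true.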


Exercises 


1. As much as possible, simplify the boolean expression 





[a xb xc]+[a xb x c]+ [a xb xT] + [a xb xT] 
2. Prove the boolean identity
$a \times (a + b) = a$
3. Prove the boolean identity
$\bar{a} \times (a \times b) = 0$


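4. Prove the boolean identity
$\overline{(a + b)} = \bar{a} \times \bar{b}$

5. Prove the boolean identity
$\overline{(a \times b)} = \bar{a} + \bar{b}$

6. Prove the boolean identity
$a + (a \times b) = a$

7. Prove that if $a \times c = b \times c$ and $a + c = b + c$, then $a = b$.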
CHAPTER 12





Sequences 


12.1 Introductory Remarks 


Many physical and mathematical quantities are best understood by using approxi- 
mations. Sometimes approximating values are given as limits. For instance, suppose 
that we want to paint the (infinitely many) squares shown in Fig. 12.1. Suppose 
also that 1 gallon of paint covers 500 square feet. How much paint will we need? 

We see that the first square has interior area 900 square feet. The first two squares 
have area 900 + 90 = 990 square feet. The first three squares have area 900 + 90 + 
9 = 999 square feet. The first four squares have area 900 + 90 + 9 + 0.9 = 999.9 
square feet. The pattern is clear. We have a list of approximations, each given by a 
sum of finitely many numbers. The approximations seem to tend to 1000. Therefore
it seems that we will need enough paint to cover 1000 square feet, or 2 gallons of
paint [we are of course conveniently ignoring the fact that the (100!)th square
will be too small to hold even one molecule of paint].
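
The list of approximations is easy to generate. This Python sketch uses exact fractions so that no rounding obscures the pattern; the function name `partial_sums` is an assumption.

```python
from fractions import Fraction


def partial_sums(first, ratio, count):
    """Running totals of first + first*ratio + first*ratio**2 + ..."""
    total = Fraction(0)
    term = Fraction(first)
    sums = []
    for _ in range(count):
        total += term
        sums.append(total)
        term *= ratio
    return sums


sums = partial_sums(900, Fraction(1, 10), 6)
print([float(s) for s in sums])
print([1000 - s for s in sums])
```

Each new square cuts the gap to 1000 by a factor of ten, which is why 1000 square feet (2 gallons) is the natural answer.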

The purpose of this chapter is to turn the above intuitive discussion into careful
mathematics. A list of approximating values will be called a “sequence.” We want
especially to study the situation when the sequence consists of “partial sums” (as in
the paint example); this is called a “series.” Thus the process we will be performing
corresponds, as in the example of the squares, to summing infinitely many numbers.

Figure 12.1 Calculating the area of a floor.


12.2 Infinite Sequences of Real Numbers 


A sequence is an ordered list of numbers. It is most common to write a sequence as
$a_1, a_2, a_3, \ldots$


or sometimes as

$\{a_j\}_{j=1}^{\infty}$
EXAMPLE 12.1 
Discuss the sequence 2, 1, 4, 3, 6, 5,.... 


Solution: We see that 


$a_1 = 2, \ a_2 = 1, \ a_3 = 4, \ a_4 = 3, \ a_5 = 6, \ a_6 = 5, \ldots$


In fact the rule that generates the sequence assigns to each positive odd integer 
the next integer and to each positive even integer the preceding integer. Check 
this assertion: f (1) = 2, f (2) = 1, f (3) = 4, f (4) = 3, f (5) = 6, f (6) = 5, and 


so on. 
Remark 12.1 It would be a mistake to think that every sequence is given by a “rule.”
Far from it. But many sequences do come from rules, and it is always interesting 
to determine what that rule is. 


EXAMPLE 12.2 
How does the sequence 


1,2,3,4,... 
differ from the sequence in Example 12.1? 
Solution: This new sequence has the same values as the sequence in Example
12.1. But they occur in a different order. Since a sequence is by definition an ordered
list, we conclude that this is a different sequence from that in Example 12.1.














Insight: Occasionally it will prove useful to begin a sequence with an index different
from 1. An example is


$\{3j - 5\}_{j=4}^{\infty}$
This denotes the sequence 7, 10, 13, 16, ....
Generally, the main reason for studying a sequence is to understand whether or
not it “tends to some limit.” Before giving a precise definition of limit, let us apply
our intuition to some examples.


EXAMPLE 12.3 
Discuss whether the sequence


$a_j = 2^{-j}$
which can also be written as
$\{2^{-j}\}_{j=1}^{\infty}$
tends to a limiting value.


Solution: If we write the sequence out as


$\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \ldots$


then we see that the terms become (and remain) arbitrarily close to zero. It seems
plausible to say that the sequence tends to zero. This intuitive notion is displayed
in Fig. 12.2.


Figure 12.2 The limit of a sequence.














EXAMPLE 12.4 
Does the sequence $a_j = (-1)^j$ tend to a limit?


Solution: We may write this sequence out as
$-1, 1, -1, 1, \ldots$


The sequence does not seem to tend to any limit: half of the time the value is 
1 and the other half of the time the value is —1. The sequence does not become 
and remain close to a single value. Therefore we say that it has no limit. Refer to 
Fig. 12.3. 














EXAMPLE 12.5 
Does the sequence $a_j = j^3$ tend to a limit?


Solution: This sequence may be written out as
$1, 8, 27, 64, \ldots$


The sequence takes values which are larger and larger, without any bound. The 
sequence tends to no limit. Refer to Fig. 12.4. 














Figure 12.3 A sequence with no limit.


Figure 12.4 A sequence that grows without bound and hence has no limit.


EXAMPLE 12.6 

A quantity of radioactive material decays. At the beginning of each week there is 
half as much as there was the previous week. The initial quantity is 5 grams. Use 
sequence notation to express the amount of material at the beginning of the jth week. 


Solution: The amount of radioactive material at the beginning of the second week 
is 5/2 (half as much as the initial amount at the beginning of the first week). The 
amount at the beginning of the third week is 5/4. The amount at the beginning of 
the fourth week is 5/8. And so forth. 

As a result, according to the description, the amount of material at the start of
the jth week is

$a_j = 5 \cdot \left(\frac{1}{2}\right)^{j-1}$


The sequence exhibits in an elegant way the process of radioactive decay: the first
several values are


$5, \ \frac{5}{2}, \ \frac{5}{4}, \ \frac{5}{8}, \ldots$


It is easy to see intuitively, or with a calculator, that the amount of radioactive
material tends to 0 as time tends to $\infty$.














EXAMPLE 12.7 
Discuss convergence for the sequence 1, 1/2, 1/3, 1/4, ....


Solution: Each term is smaller than the last. All the terms are positive. Most 
importantly, the terms are getting arbitrarily close to zero: the hundredth term is 
within 1/100 of 0, the thousandth term is within 1/1000 of 0, the millionth term 
is within 1/1000000 of 0. We conclude that the sequence converges, and its limit 
is 0. 














EXAMPLE 12.8 
Let $a_j = c$ for every $j$. What is the limit of this constant sequence?


Solution: The terms of this sequence are all the same. They do not deviate. They
stay at c—never above c and never below c. They are arbitrarily close to c because 
in fact they all equal c. We conclude that the sequence converges, and its limit 
is c. 














EXAMPLE 12.9 
Find the limit of the sequence

$$a_j = \frac{2j}{j+1}$$


Solution: It is helpful to use long division and write

$$a_j = \frac{2j}{j+1} = 2 + \frac{-2}{j+1}$$

Now the numerator in the fraction on the far right is plainly the constant value
$-2$, while the denominator increases without bound. We conclude that the quotient
gets arbitrarily small as $j \to +\infty$. Thus the sequence converges, and the limit is
$2 + 0 = 2$.














Insight: Another way to analyze the limit of $a_j$ in Example 12.9 is to divide the
numerator and denominator of the expression for $a_j$ by $j$. This yields

$$a_j = \frac{2}{1 + \frac{1}{j}}$$

Then it is apparent that $a_j \to 2/[1 + 0] = 2$.


EXAMPLE 12.10 
Find the limit of the sequence

$$a_j = \frac{10^j + 8^j}{10^j}$$


Solution: If we rewrite $a_j$ as

$$a_j = 1 + \left(\frac{8}{10}\right)^j$$

then we notice that the expression in parentheses is positive and less than 1. Higher
and higher powers of this number get smaller and smaller. In fact this entire
expression (a number less than one raised to the $j$th power) tends to 0. Thus the limit of
the sequence $a_j$ exists and equals 1.














12.2.1 SEQUENCES WITH AND WITHOUT PATTERNS 


Recall that in Example 12.7 we considered the sequence $a_j = 1/j$. The pattern (or
rule) for this sequence is a simple one. The rule for the sequence in Example 12.4
is less obvious: $f(j) = (-1)^j$. Sometimes a sequence will come from an obvious
pattern or rule, and sometimes not.


EXAMPLE 12.11 
What is the next element of the sequence 


6,6, 1,7, 10, 2,5, 3, 2, 5,3 


Solution: There is no single answer to this question. Given any finite sequence
of real numbers, there are infinitely many “patterns” or “rules” which will generate
them. The rule which we used was to look at the sentence immediately preceding 
Example 12.3 in this section of the book. The first word has six letters, the second 
six letters, the third one letter, and so on. The twelfth element of the sequence (the 
one requested above) is 9 since the twelfth word of the sentence has nine letters. The 
fifteenth element of the sequence is 8. Since the sentence only has fifteen words, 
we declare all subsequent elements of the sequence to be 0. 














Most of the sequences which we encounter in practice come from some mathe- 
matical pattern, although the pattern may be subtle. 


EXAMPLE 12.12 
Find the pattern in the sequence

$$2, \ \frac{1}{2}, \ \frac{4}{3}, \ \frac{3}{4}, \ \ldots$$

and find the limit.


Solution: If the sequence is rewritten as

$$\frac{1+1}{1}, \ \frac{2-1}{2}, \ \frac{3+1}{3}, \ \frac{4-1}{4}, \ \ldots$$

or

$$1 + \frac{1}{1}, \ 1 - \frac{1}{2}, \ 1 + \frac{1}{3}, \ 1 - \frac{1}{4}, \ \ldots$$

then we see that the $j$th term is given by

$$a_j = 1 + (-1)^{j+1} \cdot \frac{1}{j}$$

Of course $1/j$ tends to 0. Therefore $\lim_{j \to \infty} a_j = 1$.





12.3 The Tail of a Sequence 
Intuitively, we say that $\{a_j\}_{j=1}^{\infty}$ converges to $\ell$ if $a_j$ is as close as we please to $\ell$
when $j$ is large enough. A crucial feature of this idea is that convergence depends
only on what $\{a_j\}_{j=1}^{\infty}$ does when $j$ is large. The values of the first 10,000 or so $a_j$'s
are irrelevant.


EXAMPLE 12.13 
Find the limit of the sequence defined by

$$a_j = \begin{cases} 0 & \text{if } 1 \le j \le 10{,}000 \\ 10 & \text{if } j \ge 10{,}001 \end{cases}$$


Solution: The sequence $\{a_j\}_{j=1}^{\infty}$ converges to 10, because when $j$ is large
($j > 10{,}000$) we have $a_j = 10$. Far enough out, the sequence is simply constant. It has
constant value and limit equal to 10.














12.4 A Basic Theorem 


Now we want to consider some arithmetic properties of sequences. 
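Theorem 12.1 Suppose that $\{a_j\}_{j=1}^{\infty}$ and $\{b_j\}_{j=1}^{\infty}$ are convergent sequences. Then

1. $\lim_{j \to \infty} (a_j \pm b_j) = \lim_{j \to \infty} a_j \pm \lim_{j \to \infty} b_j$

2. $\lim_{j \to \infty} (a_j \cdot b_j) = \left(\lim_{j \to \infty} a_j\right) \cdot \left(\lim_{j \to \infty} b_j\right)$

3. $\lim_{j \to \infty} \dfrac{a_j}{b_j} = \dfrac{\lim_{j \to \infty} a_j}{\lim_{j \to \infty} b_j}$ provided that $\lim_{j \to \infty} b_j \ne 0$

4. $\lim_{j \to \infty} \alpha \cdot a_j = \alpha \cdot \lim_{j \to \infty} a_j$ for any real number $\alpha$

5. $\lim_{j \to \infty} a_j$ is unique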


Insight: In what follows we will generally not mention Part 5 explicitly, but we 
use it frequently. It means that if we calculate a limit by any particular method, the 
answer will be the same as if some other method were used instead: there is no 
ambiguity in the theory of limits. 


EXAMPLE 12.14 
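Compute

$$\lim_{j \to \infty} \left( \frac{3}{j} + 1 \right)$$


Solution: We apply the theorem several times as follows:

$$\begin{aligned}
\lim_{j \to \infty} \left( \frac{3}{j} + 1 \right) &= \lim_{j \to \infty} \frac{3}{j} + \lim_{j \to \infty} 1 && \text{(Part 1)} \\
&= 3 \cdot \lim_{j \to \infty} \frac{1}{j} + \lim_{j \to \infty} 1 && \text{(Part 4)} \\
&= 3 \cdot 0 + \lim_{j \to \infty} 1 && \text{(Example 12.7)} \\
&= 1 && \text{(Example 12.8)}
\end{aligned}$$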











Insight: In the beginning, you should work through problems step-by-step, carefully 
noting each time you apply one of the parts of Theorem 12.1. After a while, this 
meticulousness will be unnecessary and you will be able to do many problems in 
your head. 


EXAMPLE 12.15 
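Compute

$$\lim_{j \to \infty} \frac{1/j + 1}{3/j^2 - 5}$$


Solution: We systematically apply Theorem 12.1 to find that the limit equals

$$\begin{aligned}
\frac{\lim_{j \to \infty} (1/j + 1)}{\lim_{j \to \infty} (3/j^2 - 5)} & && \text{(Part 3)} \\
= \frac{\lim_{j \to \infty} 1/j + \lim_{j \to \infty} 1}{\lim_{j \to \infty} 3/j^2 - \lim_{j \to \infty} 5} & && \text{(Part 1)} \\
= \frac{\lim_{j \to \infty} 1/j + \lim_{j \to \infty} 1}{3 \cdot \lim_{j \to \infty} 1/j \cdot \lim_{j \to \infty} 1/j - \lim_{j \to \infty} 5} & && \text{(Parts 2, 4)} \\
= \frac{0 + 1}{3 \cdot 0 \cdot 0 - 5} & && \text{(Examples 12.7, 12.8)} \\
= -\frac{1}{5} &
\end{aligned}$$
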
EXAMPLE 12.16 
Use Theorem 12.1 to analyze the limit

$$\lim_{j \to \infty} \frac{j^3 - 4j + 6}{3j^3 + 2j}$$


Solution: Notice that the limit cannot be handled directly using Theorem 12.1 
since both the numerator and the denominator become ever larger with j (they tend 
to no limit). It is not entirely clear that the limit exists! The approach we take is to 
use some algebra before we attempt to apply Theorem 12.1. 

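Dividing numerator and denominator by the largest power of $j$ that appears
(namely, $j^3$), we find that our limit equals

$$\begin{aligned}
\lim_{j \to \infty} \frac{1 - 4j^{-2} + 6j^{-3}}{3 + 2j^{-2}} &= \frac{\lim_{j \to \infty} (1 - 4j^{-2} + 6j^{-3})}{\lim_{j \to \infty} (3 + 2j^{-2})} && \text{(Part 3)} \\
&= \frac{\lim_{j \to \infty} 1 - 4 \cdot \lim_{j \to \infty} j^{-2} + 6 \cdot \lim_{j \to \infty} j^{-3}}{\lim_{j \to \infty} 3 + 2 \cdot \lim_{j \to \infty} j^{-2}} && \text{(Parts 1, 4)}
\end{aligned}$$

However, using Part 2 of Theorem 12.1 and Example 12.7, we see that

$$\lim_{j \to \infty} j^{-2} = \lim_{j \to \infty} j^{-1} \cdot \lim_{j \to \infty} j^{-1} = 0 \cdot 0 = 0$$

A similar calculation shows that

$$\lim_{j \to \infty} j^{-3} = 0$$

We conclude that our limit equals

$$\frac{1 - 4 \cdot 0 + 6 \cdot 0}{3 + 2 \cdot 0} = \frac{1}{3}$$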
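A quick numerical check can make the conclusion of Example 12.16 more believable. The sketch below simply evaluates the quotient $(j^3 - 4j + 6)/(3j^3 + 2j)$ for increasingly large $j$; the function name `a` is illustrative, and a table of values is of course evidence, not a proof.

```python
def a(j):
    """Term j of the sequence (j^3 - 4j + 6) / (3j^3 + 2j) from Example 12.16."""
    return (j**3 - 4 * j + 6) / (3 * j**3 + 2 * j)

# The printed values should settle near 1/3 as j grows.
for j in (1, 10, 100, 1000, 10**6):
    print(j, a(j))
```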

















From now on, in order to facilitate the flow of ideas, we will not continue to cite 
each use of Theorem 12.1. But as you read the examples you should check for yourself
why each step is correct. 


You Try It: Analyze the sequence aj = ~ j/V] + 51. 


12.5 The Pinching Theorem 


We turn now to a “pinching theorem” for sequences. This tool gives us a means 
for comparing a new, perhaps difficult and mysterious, sequence with other more 
familiar sequences. 








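Theorem 12.2 (The Pinching Theorem) Suppose that $\{a_j\}_{j=1}^{\infty}$, $\{b_j\}_{j=1}^{\infty}$, $\{c_j\}_{j=1}^{\infty}$
are sequences. If

$$a_j \le b_j \le c_j \quad \text{for every } j$$

and if

$$\lim_{j \to \infty} a_j = \lim_{j \to \infty} c_j = \ell$$

then

$$\lim_{j \to \infty} b_j = \ell$$

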
EXAMPLE 12.17 
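Evaluate

$$\lim_{j \to \infty} \frac{(\sin j)^2}{j}$$


Solution: Let $b_j = \frac{(\sin j)^2}{j}$. Since $0 \le (\sin j)^2 \le 1$, we have

$$0 \le b_j \le \frac{1}{j}$$

To apply the pinching theorem, let

$$a_j = 0 \quad \text{and} \quad c_j = \frac{1}{j} \quad \text{for every } j$$

We observe that

$$\lim_{j \to \infty} a_j = \lim_{j \to \infty} c_j = 0$$

The hypotheses of the pinching theorem are satisfied and we may conclude that

$$\lim_{j \to \infty} \frac{(\sin j)^2}{j} = \lim_{j \to \infty} b_j = 0$$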
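The squeeze in Example 12.17 can also be observed numerically. The following sketch prints the lower bound $0$, the term $b_j = (\sin j)^2/j$, and the upper bound $1/j$ side by side; the function name `b` is illustrative.

```python
import math

def b(j):
    """Term j of the sequence (sin j)^2 / j from Example 12.17."""
    return math.sin(j) ** 2 / j

# Each row shows lower bound, b_j, upper bound; b_j is trapped between them.
for j in (1, 10, 100, 10**4, 10**6):
    print(j, 0.0, b(j), 1.0 / j)
```

Because both bounds tend to 0, the trapped middle column has nowhere else to go.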








12.6 Some Special Sequences 


A number of special sequences occur repeatedly in our work. We now collect several 
of them for easy reference. 


Theorem 12.3 Let s be a real number. 


1. If $s < 0$ then the sequence $\{j^s\}_{j=1}^{\infty}$ converges to 0.

2. If $s > 0$ then the sequence $\{j^s\}_{j=1}^{\infty}$ diverges.

3. If $s = 0$ then the sequence $\{j^s\}_{j=1}^{\infty}$ is just the constant sequence $1, 1, 1, \ldots$
and converges to 1.


Theorem 12.4 is a companion to Theorem 12.3. In Theorem 12.3 the j’s are the 
base of the exponential, while in Theorem 12.4 we have j playing the role of the 
exponent. 


Theorem 12.4 Let t be a real number. 


1. If $|t| < 1$ then $\{t^j\}_{j=1}^{\infty}$ converges to 0.

2. If $|t| > 1$ then $\{t^j\}_{j=1}^{\infty}$ diverges.

3. If $t = 1$ then $\{t^j\}_{j=1}^{\infty}$ is the constant sequence $1, 1, 1, \ldots$ which converges
to 1.

4. If $t = -1$ then $\{t^j\}_{j=1}^{\infty}$ diverges.


EXAMPLE 12.18 
Calculate 


pe +42/? 
lim —W~—— 
j7o 4j? 


Solution: We use some algebraic manipulations to rewrite our limit as 


-3/2 972 
lim — + lim —~ 
1 
= —.- lim —- + lim — 


4 jroo pp j>œ 2 


By Theorem 12.3 this last line equals

$$\frac{1}{4} \cdot 0 + \frac{1}{2} = \frac{1}{2}$$


EXAMPLE 12.19 
Calculate 
ee eed 
lim - 
j—> œ 4J 


Solution: We multiply the numerator and denominator by $4^{-j}$ and simplify to
obtain that our limit equals





| VAS +34 O (UDI + 3/4)! 
lim 3 g = lim ——— 
jroo 4J .4-J jroo 1 


1\/ 3 J 
li = li = 
ee (3) go G) 


Now by Theorem 12.4(1) this equals

$$0 + 0 = 0$$


Exercises


In Exercises 1 and 2, use your intuition to determine whether the given sequence 
converges. If it does, say what the limit is. Do not attempt to supply proofs. Use 
your calculator if you wish to gather data. 





_ 1 
l. aj = Pal 


ae 
2. aj = pti 
In Exercises 3 and 4, determine intuitively whether the limit exists and what the 
limit is. Then verify your answer rigorously using the precise definition of limit. 
: 1 
3) lim j— 00 j+7 
4. limj+oo 1077 


It is important for you to realize that there are many sequences whose convergence 
properties are not at all obvious. Each of the sequences in Exercises 5 and 6 con- 
verges, but it is quite tricky to determine what the limit is and to prove the answer. 
Use a calculator to help you guess what the limit is in each problem, but do not 
worry for now about proving your answer rigorously. 


5. $\lim_{j \to \infty} j \cdot \sin(1/j)$
6. $\lim_{j \to \infty} j^2 \cdot (1 - \cos(1/j))$


In Exercises 7 and 8, compute the limit using the rules which you learned in 
Theorems 12.1, 12.3, and 12.4. Note explicitly each time that you use a rule. 


Use the pinching theorem to evaluate the limits in Exercises 9 and 10. 
sin(1/j) 
J 


10. lim j-+c0 Y 





In Exercise 11, compute the limit using the rules which you learned in this chapter.
Note explicitly each time that you use a rule. 


11. limj—.0(27/ + 3) 


CHAPTER 13 





Series 


13.1 Fundamental Ideas 


The idea of adding together finitely many numbers is a simple and familiar one. 
We are so used to the associative and commutative properties of finite sums that 
we tend not to think about them. Now we shall learn to add together infinitely many 
numbers, and we will have to be a bit more careful. As Example 13.1 suggests, 
contradictions can arise if we do not establish certain ground rules. 


EXAMPLE 13.1 
Give an example of an infinite summation process in which the ordinary rules of 
arithmetic appear to fail. 


Solution: Consider the infinite sum

$$1 - 1 + 1 - 1 + 1 - 1 + 1 - 1 + \cdots$$

If we group the terms as

$$(1 - 1) + (1 - 1) + (1 - 1) + (1 - 1) + \cdots$$

then the sum seems to be

$$0 + 0 + 0 + 0 + \cdots$$

which ought to be 0. On the other hand, if we group the terms as

$$1 + (-1 + 1) + (-1 + 1) + (-1 + 1) + \cdots$$

then the sum seems to be

$$1 + 0 + 0 + 0 + \cdots$$

which ought to be 1.
Apparently the associative law of addition fails for this infinite sum. 














It turns out that, while the associative law is valid for finite sums, it is not valid 
in general for infinite sums. In order to make sense of this and other addition 
operations, we must first consider precisely what it means to “add” infinitely many 
numbers. 

We begin by recalling the following notation from Sec. 6.4: 


Definition 13.1 The symbol

$$\sum_{j=M}^{N} c_j$$

denotes the sum of the numbers $c_M, c_{M+1}, \ldots, c_{N-1}, c_N$. In other words

$$\sum_{j=M}^{N} c_j = c_M + c_{M+1} + \cdots + c_{N-1} + c_N$$


EXAMPLE 13.2 
If $c_j = 2^{-j}$ then calculate $\sum_{j=1}^{4} c_j$ and $\sum_{j=3}^{7} c_j$.


Solution: We have

$$\sum_{j=1}^{4} c_j = 2^{-1} + 2^{-2} + 2^{-3} + 2^{-4} = \frac{15}{16}$$

and

$$\sum_{j=3}^{7} c_j = 2^{-3} + 2^{-4} + 2^{-5} + 2^{-6} + 2^{-7} = \frac{31}{128}$$
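Finite sums of this kind are easy to check by machine. The sketch below mirrors the summation notation of Definition 13.1 with a small helper; the names `c` and `finite_sum` are hypothetical, and exact fractions are used so that the answers can be compared directly with the ones computed by hand.

```python
from fractions import Fraction

def c(j):
    """The summand c_j = 2^(-j), kept as an exact fraction."""
    return Fraction(1, 2**j)

def finite_sum(term, M, N):
    """Sum term(j) for j = M, M+1, ..., N (both endpoints included)."""
    return sum(term(j) for j in range(M, N + 1))

print(finite_sum(c, 1, 4))  # sum from j = 1 to 4
print(finite_sum(c, 3, 7))  # sum from j = 3 to 7
```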
Definition 13.2 If $c_1, c_2, c_3, \ldots$ are real numbers and $\sum_{j=1}^{\infty} c_j$ the corresponding
series, then we write $S_N$ to denote the sum of the first $N$ of them:

$$S_N = \sum_{j=1}^{N} c_j$$

We call $S_N$ the $N$th partial sum of the series $\sum_{j=1}^{\infty} c_j$.
If

$$\lim_{N \to \infty} S_N = \ell$$

then we say that the infinite sum (or series) $\sum_{j=1}^{\infty} c_j$ converges to $\ell$. If the sequence
$\{S_N\}_{N=1}^{\infty}$ does not converge then we say that the series diverges.


Insight: The idea in this definition is as follows: to add up infinitely many numbers
$c_1 + c_2 + c_3 + \cdots$ we add up a large finite number of the terms. That is, we consider
the partial sums

$$S_N = c_1 + c_2 + c_3 + \cdots + c_N$$

In the case that the series converges, this partial sum should give us a good
approximation to the full sum $\sum_{j=1}^{\infty} c_j$.

In practice, when we write

$$\sum_{j=1}^{\infty} c_j$$

we mean

$$\lim_{N \to \infty} S_N$$

provided that this limit exists.


13.2 Some Examples


EXAMPLE 13.3 
Does the series $\sum_{j=1}^{\infty} 2^{-j}$ converge? If so, to what limit?


Solution: Since $2^{-j} = 2^{-j+1} - 2^{-j}$ for each $j$, we have

$$\begin{aligned}
S_N = \sum_{j=1}^{N} 2^{-j} &= 2^{-1} + 2^{-2} + 2^{-3} + \cdots + 2^{-N} \\
&= (2^{0} - 2^{-1}) + (2^{-1} - 2^{-2}) + \cdots + (2^{-N+1} - 2^{-N}) \qquad (13.1)
\end{aligned}$$

Notice that $S_N$ is a finite sum so that it is correct to use the associative law of
addition. We conclude that

$$\begin{aligned}
S_N &= 2^{0} + (-2^{-1} + 2^{-1}) + (-2^{-2} + 2^{-2}) + \cdots + (-2^{-(N-1)} + 2^{-(N-1)}) - 2^{-N} \\
&= 1 - 2^{-N}
\end{aligned}$$

We may now pass to the limit:

$$\lim_{N \to \infty} S_N = \lim_{N \to \infty} (1 - 2^{-N}) = 1 - \lim_{N \to \infty} 2^{-N} = 1 - 0 = 1$$

By definition, we say that the series

$$\sum_{j=1}^{\infty} 2^{-j}$$

converges to 1. Refer to Fig. 13.1.


Figure 13.1 Convergence of the series $\sum_{j} 2^{-j}$.


Insight: Do not worry about how one thinks of the trick that was used in
Example 13.3. The purpose of this chapter is to teach you all the techniques that 
you will need in order to study series. We will learn them gradually, beginning in 
the next section. For now, just concentrate on learning what it means for a series to 
converge or diverge. 


EXAMPLE 13.4 
Discuss convergence for the series 


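$$\sum_{j=1}^{\infty} (-1)^{j+1} = 1 - 1 + 1 - 1 + 1 - \cdots$$


Solution: If $N$ is odd then

$$\begin{aligned}
S_N &= 1 - 1 + 1 - 1 + \cdots + 1 \\
&= (1 - 1) + (1 - 1) + \cdots + (1 - 1) + 1 \\
&= 1
\end{aligned}$$

However, if $N$ is even then

$$\begin{aligned}
S_N &= 1 - 1 + 1 - 1 + \cdots - 1 \\
&= (1 - 1) + (1 - 1) + \cdots + (1 - 1) \\
&= 0
\end{aligned}$$

Therefore the sequence $\{S_N\}_{N=1}^{\infty}$ of partial sums is just the sequence

$$1, \ 0, \ 1, \ 0, \ 1, \ 0, \ \ldots$$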


which does not converge. 
We conclude that the series itself does not converge. 
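The oscillation of the partial sums in Example 13.4 is easy to see by computing them. The sketch below builds running totals with the standard library's `itertools.accumulate`; the helper name `partial_sums` is illustrative.

```python
from itertools import accumulate

def partial_sums(terms):
    """Return the list S_1, S_2, ..., S_N of running totals of the given terms."""
    return list(accumulate(terms))

# First ten summands of the series 1 - 1 + 1 - 1 + ...
terms = [(-1) ** (j + 1) for j in range(1, 11)]
print(partial_sums(terms))
```

The output alternates between 1 and 0 and never settles, which is exactly what divergence of the series means here.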














Insight: Example 13.4 explains away the apparent contradiction that we en- 
countered in Example 13.1. The lesson is that we should not attempt to perform 
arithmetic operations on series that do not converge. 


WARNING: It is easy to become confused at this point about the difference between
a sequence and a series. Remember that a sequence is a list of numbers while a
series is a sum of numbers. However, we study a series by looking at its sequence
of partial sums.

Sometimes there are tricks involved in seeing that a series converges:


EXAMPLE 13.5 
Discuss convergence for the series 


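$$\sum_{j=1}^{\infty} \frac{1}{j(j+1)}$$


Solution: If you use a calculator to compute some partial sums (up to $S_{29}$, for
instance) then you will probably be convinced that the series converges. Here is a
way to get this conclusion mathematically: We write

$$\begin{aligned}
S_N &= \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \cdots + \frac{1}{N(N+1)} \\
&= \left(\frac{1}{1} - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{N} - \frac{1}{N+1}\right)
\end{aligned}$$

Almost everything cancels out and we have

$$S_N = 1 - \frac{1}{N+1}$$

Clearly $\lim_{N \to \infty} S_N = 1$. Therefore the series

$$\sum_{j=1}^{\infty} \frac{1}{j(j+1)}$$

converges to 1.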
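The cancellation in Example 13.5 can be confirmed with exact arithmetic. This sketch adds up the terms $1/(j(j+1))$ directly and compares the total with the closed form $1 - 1/(N+1)$; the function name `S` follows the text's notation and is otherwise arbitrary.

```python
from fractions import Fraction

def S(N):
    """N-th partial sum of the series with summands 1 / (j (j + 1)), computed term by term."""
    return sum(Fraction(1, j * (j + 1)) for j in range(1, N + 1))

# Compare the brute-force sum with the telescoped formula 1 - 1/(N + 1).
for N in (1, 2, 5, 20, 100):
    print(N, S(N), 1 - Fraction(1, N + 1))
```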


Insight: Sometimes it is convenient to begin a series at an index other than $j = 1$.
An example is

$$\sum_{j=3}^{\infty} \frac{3}{\ln(j - 1)}$$

Obviously it would not do to begin this series at $j = 1$ because $\ln 0$ is undefined.
We cannot begin at $j = 2$ because $\ln 1 = 0$. So we begin at $j = 3$. Of course this
series is equal to

$$\frac{3}{\ln 2} + \frac{3}{\ln 3} + \frac{3}{\ln 4} + \cdots$$

But notice this: the partial sum $S_1$ is understood to be 0 because summing from
3 to 1 makes no sense. The partial sum $S_2$ is zero for a similar reason. The first
nontrivial partial sum is

$$S_3 = \frac{3}{\ln 2}$$

Indeed, for $N \ge 3$ we have

$$S_N = \frac{3}{\ln 2} + \frac{3}{\ln 3} + \cdots + \frac{3}{\ln(N - 1)}$$


EXAMPLE 13.6 
What are $S_5$, $S_{10}$, and $S_4$ for the series

$$\sum_{j=5}^{\infty} 3^j \ ?$$


Solution: The interest of this example is that the series does not begin at $j = 1$.
We have

$$S_5 = 3^5$$

$$S_{10} = 3^5 + 3^6 + 3^7 + 3^8 + 3^9 + 3^{10}$$

$$S_4 = 0$$

13.3 The Harmonic Series 


Example 13.5 was devious, but it introduced a useful technique. In the next example 
we consider a series that diverges in a subtle way. This is the important harmonic 
series.


EXAMPLE 13.7
Discuss convergence or divergence for the series

$$\sum_{j=1}^{\infty} \frac{1}{j}$$


Solution: If you calculate a large number of (for instance several hundred) partial
sums using a computer then you might never suspect that this series diverges. It
turns out that the sum of the first one million terms of this series is only about 14.4.
The divergence is taking place so slowly that it is difficult to be certain what is
actually happening. We need a mathematical proof:

Notice that

$$\begin{aligned}
S_1 &= 1 = \frac{2}{2} \\
S_2 &= 1 + \frac{1}{2} = \frac{3}{2} \\
S_4 &= 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) \ge 1 + \frac{1}{2} + \left(\frac{1}{4} + \frac{1}{4}\right) = 1 + \frac{1}{2} + \frac{1}{2} = \frac{4}{2} \\
S_8 &= 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) \\
&\ge 1 + \frac{1}{2} + \left(\frac{1}{4} + \frac{1}{4}\right) + \left(\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}\right) = \frac{5}{2}
\end{aligned}$$

In general this argument shows that

$$S_{2^k} \ge \frac{k + 2}{2}$$

The sequence of $S_N$'s is increasing since the series contains only positive terms.
The fact that the partial sums $S_1, S_2, S_4, S_8, \ldots$ increase without bound shows that
the entire sequence of partial sums must increase without bound. We conclude that
the series diverges.


Insight: The series $\sum_{j=1}^{\infty} \frac{1}{j}$ comes up often in mathematics and physics. It is called
the “harmonic series” because of its role in the theory of acoustics: if the natural


frequencies at which a given body vibrates are $p_1, p_2, \ldots$ and if
Seer ee 
Pj J +1 


for all j then the frequencies are said to form a sequence of harmonics. 
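The slow divergence of the harmonic series discussed in Example 13.7 can be explored numerically. The sketch below compares the partial sums $S_{2^k}$ with the lower bound $(k+2)/2$ from the proof and then evaluates the sum of the first one million terms. The helper name `harmonic` is illustrative; `math.fsum` is used only to keep floating-point rounding under control.

```python
import math

def harmonic(N):
    """Partial sum S_N = 1 + 1/2 + ... + 1/N of the harmonic series."""
    return math.fsum(1.0 / j for j in range(1, N + 1))

# S_{2^k} should never fall below (k + 2) / 2.
for k in range(0, 11):
    N = 2**k
    print(N, harmonic(N), (k + 2) / 2)

# One million terms still give only a modest total.
print(harmonic(10**6))
```

The bound grows by 1/2 each time the number of terms doubles, so it does grow without limit, just extremely slowly.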


13.4 Series of Powers 


We now consider series of powers of a fixed number. These are called geometric 
series. 


EXAMPLE 13.8 
Discuss convergence for the series $\sum_{j=1}^{\infty} 10^{-j}$.


Solution: Notice that

$$S_1 = 0.1, \qquad S_2 = 0.11, \qquad S_3 = 0.111,$$

and so on.


We see that the partial sums form an increasing sequence that is bounded above by 
1. It is plausible—and there is a mathematical theorem that makes this assertion 
rigorous—that such a sequence must accumulate at some point less than or equal 
to 1. Thus the series converges. (We shall treat repeating decimals in more detail in 
the next section.) To what number does the series converge? 

Referring back to the material in Sec. 5.4 on the real number system, we see that
the sequence of partial sums $\{S_N\}_{N=1}^{\infty}$ converges to the real number given by the
infinite decimal expansion

$$0.11111\ldots = \frac{1}{9}$$

Therefore the series $\sum_{j=1}^{\infty} 10^{-j}$ converges to $1/9$.


EXAMPLE 13.9
Discuss convergence for the series $\sum_{j=1}^{\infty} 17^{-j}$.


Solution: Notice that

$$S_1 < S_2 < S_3 < \cdots$$

since all of the summands are positive. Also

$$\begin{aligned}
S_N &= 17^{-1} + 17^{-2} + \cdots + 17^{-N} \\
&< 2^{-1} + 2^{-2} + \cdots + 2^{-N} \\
&= 1 - 2^{-N} \\
&< 2
\end{aligned}$$

(We know this from the calculation that we did in Example 13.3). Therefore the
sequence of partial sums $S_1, S_2, S_3, \ldots$ is increasing and bounded above by 2. As
noted before, when a sequence increases and is bounded above, it must accumulate
at some point. Therefore our partial sums have a limit $\ell$. We conclude that the series
converges.














Insight: Notice in the last example that we demonstrated that the series converges 
without actually finding the sum. This is the way matters will often turn out in our 
study of series. 


13.5 Repeating Decimals 


In this section we begin to understand infinitely repeating decimals. The way we 
do this is that we learn to convert them to rational fractions. 


EXAMPLE 13.10 
Consider the real number given by the repeating decimal expansion 


$$x = 2.713131313\ldots$$

We often find it convenient to write a number like this as

$$x = 2.7\overline{13}$$

where the overbar indicates that the decimal digits under it are repeated indefinitely.
This number $x$ is a legitimate real number, but we would like to write it in a more
compact (or perhaps more understandable) form.
The device that we use is to compare $100x$ with $x$. Thus we have

$$\begin{aligned}
100x &= 271.3131313\ldots \\
x &= \phantom{27}2.7131313\ldots
\end{aligned}$$

Clearly the multiple of 100 was chosen so that the infinite repetitions of 13 line up
nicely in our array. Subtracting, we find that

$$99x = 268.6$$

or

$$x = \frac{268.6}{99} = \frac{2686}{990}$$


As a result of our calculation, we have expressed the infinitely repeating decimal 
x as a rational fraction. 
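The conversion in Example 13.10 can be automated. The sketch below is one possible way to do it and is not the subtraction device used in the text: it treats the repeating block as a fraction with denominator $99\ldots9$ (for instance $0.\overline{13} = 13/99$) and shifts it past the non-repeating digits. The function name `repeating_to_fraction` and its two string arguments are hypothetical.

```python
from fractions import Fraction

def repeating_to_fraction(prefix, block):
    """Convert a repeating decimal to an exact fraction.

    prefix: the non-repeating part as a string, e.g. "2.7"
    block:  the digits that repeat forever, e.g. "13"
    """
    # Number of digits after the decimal point in the non-repeating part.
    decimals = len(prefix.split(".")[1]) if "." in prefix else 0
    # A block of k repeating digits equals block / (10^k - 1).
    repeating = Fraction(int(block), 10 ** len(block) - 1)
    return Fraction(prefix) + repeating / 10**decimals

print(repeating_to_fraction("2.7", "13"))  # the number x of Example 13.10, in lowest terms
print(repeating_to_fraction("0", "1"))     # 0.11111...
```

Note that `Fraction` reduces to lowest terms automatically, so the first result is printed in reduced form rather than as $2686/990$.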














You Try It: Use the technique just presented to convert the repeating decimal 
0.11111... 
to a rational fraction. Of course you should obtain the answer 1/9. 


EXAMPLE 13.11 
Let us convert the repeating decimal 


$$y = 37.127127127\ldots = 37.\overline{127}$$

to a rational fraction.
We write

$$\begin{aligned}
1000y &= 37127.127127127\ldots \\
y &= \phantom{000}37.127127127\ldots
\end{aligned}$$

Subtracting, we find that

$$999y = 37090$$

or

$$y = \frac{37090}{999}$$














It is an interesting fact that decimal expansions that repeat, such as the ones
we considered in the last two examples, always give rise to rational numbers (that
is, quotients of integers). A terminating expansion counts as repeating here, since it
ends in an endless string of zeros. Decimal expansions that never settle into a
repeating block always give rise to irrational numbers (that is, numbers that cannot
be expressed as quotients of integers).


13.6 An Application 


In this section we illustrate some of our ideas about series with an example from 
the physical world. 


EXAMPLE 13.12 

A rocket uses a great deal of fuel during liftoff. Once the rocket leaves the earth’s 
gravitational field, relatively little fuel is required to keep the rocket in motion. 
Scientists estimate that, for a certain class of rocket, the vehicle uses 1/4 of its total 
fuel supply during its first hour of flight, 1/8 of its fuel supply during the second 
hour, 1/12 of its fuel supply during the third hour, and so on. A newspaper reporter 
hears this information and reports that the rocket’s fuel supply will “last forever.” 
Why is the reporter in error? 


Solution: The portion of the total fuel supply that the rocket has consumed after
$N$ hours is

$$S_N = \frac{1}{4} + \frac{1}{8} + \frac{1}{12} + \cdots + \frac{1}{4N} = \frac{1}{4} \sum_{j=1}^{N} \frac{1}{j}$$

Thus

$$S_N = \frac{1}{4} \left( 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{N} \right)$$

But we learned in Example 13.7 that the expression in parentheses becomes large
without bound as $N$ increases. In particular, $S_N > 1$ when $N$ is large enough, so
the rocket will eventually exhaust its fuel supply (in fact a calculation shows that
the fuel will be exhausted during the thirty-first hour of flight).
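The "calculation" mentioned at the end of Example 13.12 can be carried out directly. The sketch below accumulates the fraction of fuel used hour by hour, with hour $j$ consuming $1/(4j)$ of the supply, and reports the first hour at which the running total reaches the whole supply. The function name `hours_until_empty` is illustrative.

```python
from fractions import Fraction

def hours_until_empty():
    """Return the first hour N for which 1/4 + 1/8 + ... + 1/(4N) reaches 1."""
    used = Fraction(0)  # fraction of the total fuel consumed so far
    hour = 0
    while used < 1:
        hour += 1
        used += Fraction(1, 4 * hour)  # fuel burned during this hour
    return hour

print(hours_until_empty())
```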














13.7 A Basic Test for Convergence 


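Theorem 13.1 (The zero test) If the series $\sum_{j=1}^{\infty} c_j$ converges then the summands
$c_j$ tend to 0.


Proof: Since $\sum_{j=1}^{\infty} c_j$ converges, the partial sums $S_N$ tend to some limit $\ell$. Let
$\epsilon > 0$. Choose $N_0$ such that if $N > N_0$ then $|S_N - \ell| < \epsilon/2$. Then also $|S_{N+1} - \ell| < \epsilon/2$.
Therefore

$$\begin{aligned}
|c_{N+1} - 0| = |c_{N+1}| &= |S_{N+1} - S_N| \\
&= |(S_{N+1} - \ell) + (\ell - S_N)| \\
&\le |S_{N+1} - \ell| + |\ell - S_N| \\
&< \frac{\epsilon}{2} + \frac{\epsilon}{2} \\
&= \epsilon
\end{aligned}$$

This says that $c_N \to 0$.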











EXAMPLE 13.13 
Apply the zero test to the series $\sum_{j=1}^{\infty} (-1)^j$.


Solution: The summands $c_j = (-1)^j$ do not tend to zero; therefore, by the zero test, the series
diverges.














EXAMPLE 13.14 
Apply the zero test to the series $\sum_{j=1}^{\infty} \frac{1}{j}$.


Solution: The terms $c_j = 1/j$ certainly tend to 0. But we cannot conclude that
the series converges (that is not what Theorem 13.1 says). In fact we proved in
Example 13.7 that the series diverges.














EXAMPLE 13.15 
Apply the zero test to the series $\sum_{j=1}^{\infty} 17^{-j}$.


Solution: The terms $c_j = 17^{-j}$ certainly tend to zero. So the series might
converge, but the zero test will not tell us. In Example 13.9, we in fact learned that the
series does converge.














13.8 Basic Properties of Series 


We now present some elementary properties of series that are similar to the prop- 
erties of sequences given in Theorem 12.1. 


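Theorem 13.2 Suppose that $\sum_{j=1}^{\infty} c_j$ converges to $C$ and $\sum_{j=1}^{\infty} d_j$ converges to
$D$. Then

1. $\sum_{j=1}^{\infty} (c_j \pm d_j)$ converges to $C \pm D$.

2. $\sum_{j=1}^{\infty} \alpha \cdot c_j$ converges to $\alpha \cdot C$, where $\alpha$ is any real constant.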


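Corollary 13.1 If $\sum_{j=1}^{\infty} b_j$ diverges and $\alpha$ is any nonzero constant then $\sum_{j=1}^{\infty} \alpha \cdot b_j$
diverges.


Proof of Corollary 13.1: Seeking a contradiction, we suppose that $\sum_{j=1}^{\infty} \alpha \cdot b_j$
converges. By Part (2) of the theorem,

$$\sum_{j=1}^{\infty} \frac{1}{\alpha} \cdot (\alpha \cdot b_j) = \sum_{j=1}^{\infty} b_j$$

also converges. That is a contradiction.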











Here are some examples which indicate how Theorem 13.2 and its corollary can 
be applied. 


EXAMPLE 13.16 
Discuss convergence for the series 


(13.2) 


x| s 


Il 
ah 


J 


Solution: We know by Example 13.3 that

$$\sum_{j=1}^{\infty} \frac{1}{2^j}$$

converges. But then Part (2) of Theorem 13.2 applies to tell us that the series in
Eq. (13.2) converges.














EXAMPLE 13.17 
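Does the series

$$\sum_{j=1}^{\infty} \frac{2^j + 10^j}{20^j}$$

converge?


Solution: We rewrite the $j$th summand as

$$\frac{2^j + 10^j}{20^j} = \frac{1}{10^j} + \frac{1}{2^j}$$

Because

$$\sum_{j=1}^{\infty} \frac{1}{2^j}$$

converges (Example 13.3) and

$$\sum_{j=1}^{\infty} \frac{1}{10^j}$$

converges (Example 13.8), we may apply Part (1) of Theorem 13.2 to conclude that

$$\sum_{j=1}^{\infty} \left( \frac{1}{10^j} + \frac{1}{2^j} \right)$$

converges. Therefore the original series converges.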


EXAMPLE 13.18 
Does the series

$$\sum_{j=1}^{\infty} \frac{j + 2^j}{j \cdot 2^j} \qquad (13.3)$$

converge?


Solution: Suppose that the series does converge. We also know that

$$\sum_{j=1}^{\infty} \frac{1}{2^j}$$

converges (by Example 13.3). Therefore Part (1) of Theorem 13.2 tells us that

$$\sum_{j=1}^{\infty} \left( \frac{j + 2^j}{j \cdot 2^j} - \frac{1}{2^j} \right)$$

converges.


Performing the subtraction, we may conclude that

$$\sum_{j=1}^{\infty} \frac{1}{j}$$

converges. But that is false (Example 13.7). Therefore our assumption is
contradicted and the original series [Eq. (13.3)] diverges.














EXAMPLE 13.19 
Discuss convergence of the series

$$\sum_{j=1}^{\infty} \left( 2^{-j+7} - 10^{-j+3} \right)$$

Solution: Rewrite the series as

$$\sum_{j=1}^{\infty} \left( 128 \cdot 2^{-j} - 1000 \cdot 10^{-j} \right)$$

Now

$$\sum_{j=1}^{\infty} 2^{-j}$$

is convergent; therefore Part (2) of Theorem 13.2 tells us that

$$\sum_{j=1}^{\infty} 128 \cdot 2^{-j} \qquad (13.4)$$

converges. Similarly,

$$\sum_{j=1}^{\infty} 10^{-j}$$

converges; hence we may conclude that

$$\sum_{j=1}^{\infty} 1000 \cdot 10^{-j} \qquad (13.5)$$

converges.
Finally, Eqs. (13.4) and (13.5), together with Part (1) of Theorem 13.2, tell us
that the original series converges.














13.9 Geometric Series 


We will next discuss how to sum explicitly a special kind of series called the
geometric series. (We already began exploring these in Sec. 13.4.)

Let $\lambda$ be a fixed real number and consider the series $\sum_{j=0}^{\infty} \lambda^j$. We find it
convenient here to begin the sum with the index $j = 0$ rather than $j = 1$. By convention,
$\lambda^0 = 1$. Then

$$S_N = 1 + \lambda + \lambda^2 + \cdots + \lambda^N$$

Therefore

$$\begin{aligned}
\lambda \cdot S_N &= \lambda + \lambda^2 + \lambda^3 + \cdots + \lambda^{N+1} \\
&= (1 + \lambda + \lambda^2 + \cdots + \lambda^N) + (\lambda^{N+1} - 1) \\
&= S_N + (\lambda^{N+1} - 1)
\end{aligned}$$

We reorganize the equation so that the terms involving $S_N$ are together:

$$(\lambda - 1) \cdot S_N = \lambda^{N+1} - 1$$

As long as $\lambda \ne 1$, we may divide by $(\lambda - 1)$ to obtain

$$S_N = \frac{1 - \lambda^{N+1}}{1 - \lambda} \qquad (13.6)$$

This formula for $S_N$ tells us everything that we could want to know about the
geometric series $\sum_{j=0}^{\infty} \lambda^j$.


Theorem 13.3 If $|\lambda| < 1$ then the geometric series $\sum_{j=0}^{\infty} \lambda^j$ converges to the sum
$1/(1 - \lambda)$. If $|\lambda| > 1$ then the geometric series $\sum_{j=0}^{\infty} \lambda^j$ does not converge.


Proof: Remember what it means for a series to converge: the sequence of partial
sums $S_N$ converges to some limit. If $|\lambda| < 1$ then $\lambda^{N+1} \to 0$ by Theorem 12.4, so by (13.6) we see that

$$S_N = \frac{1 - \lambda^{N+1}}{1 - \lambda} \to \frac{1}{1 - \lambda}$$

That is the first statement of the theorem.
If $|\lambda| > 1$ then the terms of the series do not tend to zero so that, by the zero test,
the series cannot converge.
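Formula (13.6) is easy to test against brute-force addition. The sketch below computes the partial sum both ways with exact fractions; the function names `geometric_partial_sum` and `closed_form` are hypothetical, and the parameter `lam` stands for $\lambda$.

```python
from fractions import Fraction

def geometric_partial_sum(lam, N):
    """S_N = 1 + lam + lam^2 + ... + lam^N, added term by term."""
    return sum(lam**j for j in range(N + 1))

def closed_form(lam, N):
    """The same partial sum from formula (13.6); requires lam != 1."""
    return (1 - lam ** (N + 1)) / (1 - lam)

lam = Fraction(1, 3)
for N in (0, 1, 5, 20):
    print(N, geometric_partial_sum(lam, N), closed_form(lam, N))

# For |lam| < 1 the partial sums approach 1 / (1 - lam).
print(float(closed_form(lam, 60)), float(1 / (1 - lam)))
```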














EXAMPLE 13.20 
Discuss convergence for the series

$$\sum_{j=0}^{\infty} 3^{-j}$$

Solution: We rewrite the series as

$$\sum_{j=0}^{\infty} \left( \frac{1}{3} \right)^j$$

which converges to

$$\frac{1}{1 - (1/3)} = \frac{3}{2}$$

(because $|1/3| < 1$).





EXAMPLE 13.21 
Discuss convergence for the series

$$\sum_{j=0}^{\infty} (-2)^j$$


Solution: Since $|-2| > 1$, the series must diverge.


EXAMPLE 13.22
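Sum the series

$$\sum_{j=4}^{\infty} 6^{-j} \qquad (13.7)$$

explicitly.


Solution: We begin by rewriting the series as

$$\sum_{k=0}^{\infty} 6^{-(k+4)} \qquad (13.8)$$

Why can we do this? Write out the first several terms of both Eqs. (13.7) and (13.8)
to see that they are simply two different ways of writing

$$6^{-4} + 6^{-5} + 6^{-6} + \cdots$$

Now we may use Theorem 13.2(2) to rewrite (13.8) as

$$6^{-4} \cdot \sum_{k=0}^{\infty} 6^{-k} = 6^{-4} \cdot \sum_{k=0}^{\infty} \left( \frac{1}{6} \right)^k$$

Finally, by Theorem 13.3,

$$\sum_{k=0}^{\infty} \left( \frac{1}{6} \right)^k = \frac{1}{1 - \frac{1}{6}} = \frac{6}{5}$$

We conclude that

$$\sum_{j=4}^{\infty} 6^{-j} = 6^{-4} \cdot \frac{6}{5} = \frac{1}{1080}$$

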


EXAMPLE 13.23 
Sum the series 


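explicitly.


Solution: We rewrite the series as

$$\sum_{j=0}^{\infty} \frac{2^{j+10}}{7^{j+3}} = \frac{2^{10}}{7^3} \sum_{j=0}^{\infty} \left( \frac{2}{7} \right)^j$$

By Theorem 13.3,

$$\sum_{j=0}^{\infty} \left( \frac{2}{7} \right)^j = \frac{1}{1 - (2/7)} = \frac{7}{5}$$

Therefore the sum of the series is

$$\frac{1024}{343} \cdot \frac{7}{5} = \frac{1024}{245}$$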















Insight: Whether or not a given series converges does not depend on the first million
or so terms. For suppose that we are given the series $\sum_{j=1}^{\infty} a_j$ and that we are able
to ascertain that

$$\sum_{j=10^6}^{\infty} a_j$$

converges. If $N \ge 10^6$ then this means that the partial sums

$$S_N = \sum_{j=10^6}^{N} a_j$$

converge (as a sequence) to a limit $\ell$. But then the expressions

$$T_N = \sum_{j=1}^{N} a_j = \sum_{j=1}^{10^6 - 1} a_j + \sum_{j=10^6}^{N} a_j = \sum_{j=1}^{10^6 - 1} a_j + S_N$$

converge as $N \to \infty$ to

$$\sum_{j=1}^{10^6 - 1} a_j + \ell$$

(Notice that the first sum does not depend on $N$.) What is $T_N$? It is the partial sum
for the full series

$$\sum_{j=1}^{\infty} a_j$$

In other words, we have concluded that the full series converges. To summarize:

If $\sum_{j=10^6}^{\infty} a_j$ converges then $\sum_{j=1}^{\infty} a_j$ converges.

Conversely,

If $\sum_{j=1}^{\infty} a_j$ converges then $\sum_{j=10^6}^{\infty} a_j$ converges.

The reason is that the two sums differ only by the finite number

$$\sum_{j=1}^{10^6 - 1} a_j$$

EXAMPLE 13.24 
Does the series

$$\sum_{j=75}^{\infty} (1.1)^j$$

converge?


Solution: The series must diverge. For if it converged, then it would follow that
$\sum_{j=0}^{\infty} (1.1)^j$ converges. That, of course, is false since $|1.1| > 1$.














You Try It: Discuss convergence for the series $` j (0.9)7/. 
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EXAMPLE 13.25 
Does the series

$$\sum_{j=20}^{\infty} (0.9)^j$$

converge?


Solution: Yes. For

$$\sum_{j=0}^{\infty} (0.9)^j$$

converges (because $|0.9| < 1$) and the two series differ by just the finite sum

$$0.9^0 + 0.9^1 + 0.9^2 + \cdots + 0.9^{19}$$





EXAMPLE 13.26 
Calculate the sum

$$\sum_{j=7}^{20} 5 \cdot 4^j$$

explicitly.


Solution: Notice that the limits of summation are not what we are used to; in
particular, the lower limit is not 0. We rewrite our sum as

$$5 \cdot \left( \sum_{j=0}^{20} 4^j - \sum_{j=0}^{6} 4^j \right)$$

Now we may use the formula for partial sums of geometric series which appears
in line (13.6). We find that the sum equals

$$5 \cdot \left( \frac{1 - 4^{21}}{1 - 4} - \frac{1 - 4^{7}}{1 - 4} \right)$$

1-4 1-4 
Of course this solution can be simplified, if necessary, with a little further 
calculation. 
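The bookkeeping in Example 13.26 can be double-checked with integer arithmetic. The sketch below assumes the sum runs from $j = 7$ to $j = 20$ with summand $5 \cdot 4^j$ (this is read off from the rewritten form, which subtracts the terms $j = 0, \ldots, 6$), and compares direct addition with the result of applying formula (13.6) twice. Both function names are illustrative.

```python
def direct_sum():
    """Add 5 * 4^j for j = 7, 8, ..., 20 term by term."""
    return sum(5 * 4**j for j in range(7, 21))

def via_formula():
    """Same sum as 5 * (S_20 - S_6), with S_N = (1 - 4^(N+1)) / (1 - 4) from (13.6)."""
    full = (1 - 4**21) // (1 - 4)  # S_20; the division is exact
    head = (1 - 4**7) // (1 - 4)   # S_6; the division is exact
    return 5 * (full - head)

print(direct_sum(), via_formula())
```

Using integer division keeps everything exact, since $4^{N+1} - 1$ is always divisible by 3.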
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13.9.1 AN APPLICATION 


EXAMPLE 13.27 

At a certain aluminum-recycling plant, the recycling process turns n pounds of 
used aluminum into 9n/10 pounds of new, virgin aluminum. How much usable 
aluminum will 100 pounds of virgin aluminum ultimately create, if we assume that 
it is continually returned to the same recycling plant? 


Solution: We begin with 100 pounds of aluminum. When it is returned to the plant 
as scrap and recycled, 0.9 - 100 = 90 pounds of new aluminum results. When that 
aluminum is returned to the plant as scrap and recycled, 0.9 - 90 = 81 pounds of new 
aluminum results. This process continues forever. The total amount of aluminum 
created is therefore equal to 


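$$\begin{aligned}
100 + 90 + 81 + \cdots &= 100 \cdot 1 + 100 \cdot 0.9 + 100 \cdot 0.9 \cdot 0.9 + \cdots \\
&= 100 \left[ 1 + 0.9 + 0.9^2 + \cdots \right] \\
&= 100 \cdot \sum_{j=0}^{\infty} (0.9)^j \\
&= 100 \cdot \frac{1}{1 - 0.9} \\
&= 1000
\end{aligned}$$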


Therefore 1000 pounds of usable aluminum is ultimately generated by an initial 
100 pounds of virgin aluminum. 
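
The limiting value can also be watched emerging from the partial sums. The sketch below is an illustration only: the function names and the chosen cycle counts are assumptions, not part of the text. It adds up the aluminum produced over a finite number of recycling cycles and compares the running total with the geometric-series limit.

```python
def total_aluminum(initial_pounds: float, yield_ratio: float, cycles: int) -> float:
    """Total aluminum from the initial batch plus `cycles` rounds of recycling."""
    total = 0.0
    batch = initial_pounds
    for _ in range(cycles + 1):  # the +1 counts the initial batch itself
        total += batch
        batch *= yield_ratio  # each round returns 90% of the previous batch
    return total


def limiting_total(initial_pounds: float, yield_ratio: float) -> float:
    """Sum of the full geometric series, valid when |yield_ratio| < 1."""
    return initial_pounds / (1.0 - yield_ratio)


if __name__ == "__main__":
    for cycles in (2, 10, 50, 200):
        print(cycles, total_aluminum(100.0, 0.9, cycles))
    print("limit:", limiting_total(100.0, 0.9))
```

After two rounds the total is 100 + 90 + 81 = 271 pounds, and the totals then climb toward 1000 pounds without ever exceeding it.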














13.10 Convergence of p-Series 


We frequently encounter series of the form

$$\sum_{j=1}^{\infty} \frac{1}{j^p}$$


for some fixed exponent p. Examples are 


oe) 


bp - and 


j=l J as, 


l= 
Me 
| = 
o 
i=) 
Q 
IL 8 
S 
Se} = 
TR 
The second of these series is the harmonic series, and is known to diverge (see Sec.
13.3). We shall learn that the first series converges and the third diverges. 

A useful tool for studying series of this kind is the Cauchy condensation test, 
which we formulate as follows: 


Theorem 13.4 Let $a_1 \ge a_2 \ge a_3 \ge \cdots$ be positive numbers that tend to 0. The
series

$$\sum_{j=1}^{\infty} a_j \tag{13.9}$$

converges if and only if the series

$$\sum_{j=1}^{\infty} 2^j a_{2^j} \tag{13.10}$$

converges.


This result is best understood by examining two pictures. 

In Fig. 13.2, we see that each box corresponds to (half of) a term of the condensed
series (13.10). For instance, the first box has height $a_1$ and width $1 = 2^0$; the second
box has height $a_2$ and width $2 = 2^1$; the third box has height $a_4$ and width $4 = 2^2$;
and so forth. Thus any partial sum for the original series (13.9) is dominated by
the corresponding partial sum for the condensed series (13.10). We conclude that
if the condensed series converges then the original series converges.

Figure 13.2 The condensed series dominates the original series.

Figure 13.3 The original series dominates the condensed series.

Now look at Fig. 13.3. Now the boxes lie below the graph of the $\{a_j\}$ instead of
above. The first box has height $a_2$ and width $1 = 2^0$; the second box has height $a_4$
and width $2 = 2^1$; the third box has height $a_8$ and width $4 = 2^2$; and so forth. Now
we see that any partial sum for the condensed series (13.10) is dominated by
twice the corresponding partial sum for the original series (13.9). We conclude that
if the original series converges then the condensed series converges.

Thus we now have a new tool for determining the convergence of series. Let us 
learn from some examples. 
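
Before turning to the examples, the two box-picture inequalities can be tested numerically. The following is a rough sketch under stated assumptions: the function names are hypothetical, and the sample sequences $1/j^2$ and $1/j$ are illustrative choices of positive decreasing sequences. It checks that a partial sum of the original series is bounded by $a_1$ plus a condensed partial sum, and that a condensed partial sum is bounded by twice an original partial sum.

```python
def original_partial_sum(a, n):
    """a(1) + a(2) + ... + a(n)."""
    return sum(a(j) for j in range(1, n + 1))


def condensed_partial_sum(a, k):
    """2^1 * a(2^1) + 2^2 * a(2^2) + ... + 2^k * a(2^k)."""
    return sum(2**j * a(2**j) for j in range(1, k + 1))


def sandwich_holds(a, k):
    """Check both box-picture inequalities for a positive decreasing sequence a."""
    condensed = condensed_partial_sum(a, k)
    # Boxes above the graph: original terms up to index 2^(k+1) - 1 are covered.
    upper_ok = original_partial_sum(a, 2 ** (k + 1) - 1) <= a(1) + condensed
    # Boxes below the graph: the condensed sum is at most twice the original sum.
    lower_ok = condensed <= 2 * original_partial_sum(a, 2**k)
    return upper_ok and lower_ok


if __name__ == "__main__":
    for k in range(1, 11):
        print(k, sandwich_holds(lambda j: 1.0 / j**2, k), sandwich_holds(lambda j: 1.0 / j, k))
```

For the sequence $1/j$ the condensed partial sum is simply $k$, which makes the unbounded growth of the harmonic partial sums visible at a glance.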


EXAMPLE 13.28 
Determine convergence for the series 


Solution: The summands form a decreasing sequence of positive numbers that 
tends to zero. So we may apply the Cauchy condensation test. We examine 
Of course this is a geometric series that we have studied before. It converges, hence
the original series converges. 














EXAMPLE 13.29 

Determine convergence for the series 
2 
a 


Solution: We of course apply the Cauchy condensation test. Thus we examine 
the series 


oe) 


5o72 ani = 2 ; TE = ae 


j=l 





This is a geometric series. But the terms do not tend to zero, so the series cannot 
converge. We conclude that the original series also does not converge. 














EXAMPLE 13.30 
Examine the harmonic series using the Cauchy condensation test. 


Solution: The harmonic series is

$$\sum_{j=1}^{\infty} \frac{1}{j}.$$

Therefore we must examine the series

$$\sum_{j=1}^{\infty} 2^j \cdot \frac{1}{2^j} = \sum_{j=1}^{\infty} 1.$$


Of course this series diverges, so the harmonic series diverges (as we already know) 
as well. 














In our ensuing discussions, we shall refer to a series of the form

$$\sum_{j=1}^{\infty} \frac{1}{j^p}$$

as a p-series. 
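
As a worked illustration (a standard computation, though not spelled out at this point in the text), the Cauchy condensation test settles every p-series with $p > 0$ at once. For such $p$ the terms $1/j^p$ are positive and decrease to 0, so the test applies, and the condensed series is geometric with ratio $2^{1-p}$:

```latex
\sum_{j=1}^{\infty} 2^j \cdot \frac{1}{(2^j)^p}
  = \sum_{j=1}^{\infty} \left( 2^{\,1-p} \right)^j .
```

When $p > 1$ the ratio $2^{1-p}$ is less than 1, so the condensed series, and hence the p-series, converges. When $0 < p \le 1$ the ratio is at least 1, the condensed terms do not tend to zero, and the p-series diverges.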
13.11 The Comparison Test


13.11.1 A NEW CONVERGENCE TEST 
Suppose that

$$\sum_{j=1}^{\infty} a_j$$

is a convergent series of nonnegative terms, with limit $\ell$.

Because the partial sums form an increasing sequence, these partial sums must
increase to $\ell$. Therefore for each partial sum $S_N$ we may say that

$$\sum_{j=1}^{N} a_j = S_N \le \ell.$$

If

$$\sum_{j=1}^{\infty} c_j$$

is another series satisfying $0 \le c_j \le a_j$ for every $j$, then the partial sums $T_N$ for
this series satisfy

$$T_N = \sum_{j=1}^{N} c_j \le \sum_{j=1}^{N} a_j \le \ell.$$

Thus the partial sums $T_N$ of the series $\sum c_j$ are increasing (since the $c_j$'s are
nonnegative) and bounded above by $\ell$. By a property of bounded increasing sequences
that we have discussed before, we conclude that the sequence of $T_N$'s converges.
We summarize:


Theorem 13.5 (The comparison test for convergence) Let $0 \le c_j \le a_j$ for every
$j$. If the series $\sum_{j=1}^{\infty} a_j$ converges then the series $\sum_{j=1}^{\infty} c_j$ also converges.
EXAMPLE 13.31
Show that the series 


2 j: G? +D 
converges. 
Solution: Observe that 
+12)? 


hence 
jg? +D)? 


We conclude that 


1 1 
— <— 
PAY j? 





for every j. Also 


1 
G3 


2 


CO 
j=l J 


is a convergent p-series. By the comparison test for convergence, the original series
~ 1 


LUT PeD 


j=l 














converges. 


EXAMPLE 13.32 
Discuss convergence for the series 


œ 1 
2, Gj- 2) 


j=l 
Solution: Notice that

$$3j - 2 = j + (2j - 2) \ge j$$

for all $j \ge 1$. Therefore
Ce we i 


so that 


Finally, 


converges by Example 13.28. By the comparison test for convergence, 


Lae Gj = z 


j=l 











converges. 





EXAMPLE 13.33 
Discuss convergence for the series 


$$\sum_{j=1}^{\infty} \frac{\sin^2(2j + 5)}{j^4 + 8j + 6} \tag{13.11}$$

Solution: Observe that

$$0 \le \frac{\sin^2(2j + 5)}{j^4 + 8j + 6} \le \frac{1}{j^4 + 8j + 6} \le \frac{1}{j^4}.$$

Also

$$\sum_{j=1}^{\infty} \frac{1}{j^4}$$

is a convergent p-series. By the comparison test for convergence, the original series
(13.11) converges. 
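
The comparison can be made concrete with a short numerical experiment. This is only a sketch: it assumes the summand in (13.11) is $\sin^2(2j+5)/(j^4+8j+6)$ and the comparison summand is $1/j^4$, and the function names are hypothetical. It checks the termwise inequality and shows that the partial sums of the smaller series stay below those of the larger one.

```python
import math


def c(j):
    """Summand of the series under study (assumed reading of Eq. (13.11))."""
    return math.sin(2 * j + 5) ** 2 / (j**4 + 8 * j + 6)


def a(j):
    """Summand of the comparison p-series with p = 4."""
    return 1.0 / j**4


def partial_sums(term, n):
    """List of the partial sums term(1), term(1) + term(2), ..., up to n terms."""
    total, sums = 0.0, []
    for j in range(1, n + 1):
        total += term(j)
        sums.append(total)
    return sums


if __name__ == "__main__":
    small = partial_sums(c, 2000)
    large = partial_sums(a, 2000)
    print(small[-1], large[-1])
```

A finite computation of this kind cannot prove convergence; it only illustrates the bound that the comparison test turns into a proof.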














You Try It: Discuss convergence for the series )> f jI? + j). 


EXAMPLE 13.34 
Does the series 





ee 1 
D4 G+ MG TD 


converge? 


Solution: This series has nonnegative summands and is termwise smaller than 
the series 


| = 


oo 
j=l 


However, this last series diverges. So the comparison test for convergence does not 
tell us anything about the series 


`< 





~ 1 
A GD 


j=l 


Note that we are not saying that the series diverges. Rather, we can draw no con- 
clusion. 














Insight: In fact the Cauchy condensation test may be applied directly to the series 
in the last example. First rewrite the series as 


z 1 
D Er 


EES 





Now apply Cauchy to obtain

$$\sum_{j=1}^{\infty} 2^j \cdot \frac{1}{2^j \cdot \ln(2^j)}.$$

This simplifies to

$$\frac{1}{\ln 2} \sum_{j=1}^{\infty} \frac{1}{j}.$$


Of course this is a constant multiple of the harmonic series, and it diverges. We 
conclude that the original series diverges as well. 


Insight: Because the convergence or divergence properties of a series do not 
depend on the first several terms, we could have stated the comparison test for 
convergence by requiring that $0 \le c_j \le a_j$ for all sufficiently large $j$. Let us look
at an example to see how this might work in practice. 


EXAMPLE 13.35
Determine whether the series

$$\sum_{j=1}^{\infty} \frac{1}{j^3 - 4j^2 - 6j - 5}$$

converges.
Solution: We notice that for $j \ge 20$ we have

$$
\begin{aligned}
j^3 - 4j^2 - 6j - 5 &= j \cdot j^2 - 4j^2 - 6j - 5 \\
&\ge 20j^2 - 4j^2 - 6j - 5 \\
&= 4j^2 + (5j^2 - 4j^2) + (6j^2 - 6j) + (5j^2 - 5).
\end{aligned}
$$

Now each of the terms in parentheses is positive for $j \ge 20$. Therefore the last line
exceeds $4j^2$ when $j \ge 20$. We conclude that, for large $j$,

$$\frac{1}{j^3 - 4j^2 - 6j - 5} \le \frac{1}{4j^2}.$$

Because

$$\sum_{j=1}^{\infty} \frac{1}{4j^2}$$

converges, it follows from the comparison test for convergence that the series

$$\sum_{j=1}^{\infty} \frac{1}{j^3 - 4j^2 - 6j - 5}$$

converges.


13.12 A Test for Divergence


We can now reverse our reasoning to obtain a comparison test for divergence.
Namely, suppose that $0 \le c_j \le a_j$ and that the series

$$\sum_{j=1}^{\infty} c_j$$

diverges. Then the series

$$\sum_{j=1}^{\infty} a_j$$

would have to diverge also. For if the latter series converged then the comparison
test for convergence would imply that

$$\sum_{j=1}^{\infty} c_j$$

converges, and that would be a contradiction. We summarize:


Theorem 13.6 (The comparison test for divergence) Let $0 \le c_j \le a_j$ for every $j$.
If the series $\sum_{j=1}^{\infty} c_j$ diverges then the series $\sum_{j=1}^{\infty} a_j$ also diverges.


We note that Theorem 13.6 is, in effect, a restatement of Theorem 13.5 (in fact 
it is the contrapositive). But we display it as a separate theorem because of its 
importance. 
EXAMPLE 13.36
Show that the series

$$\sum_{j=1}^{\infty} \frac{\ln(j + 4)}{j} \tag{13.12}$$

diverges.
Solution: Observe that

$$\frac{\ln(j + 4)}{j} \ge \frac{1}{j}$$

for every $j \ge 1$ and the series

$$\sum_{j=1}^{\infty} \frac{1}{j}$$

diverges. By the comparison test for divergence, we conclude that the series in
Eq. (13.12) diverges as well. 
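
A quick numerical look shows the comparison at work. The sketch below is an illustration only: it assumes the summand in Eq. (13.12) is $\ln(j+4)/j$, and the function names are hypothetical. It computes partial sums of that series alongside partial sums of the harmonic series, which sit below them and keep growing.

```python
import math


def partial_sum(term, n):
    """term(1) + term(2) + ... + term(n)."""
    return sum(term(j) for j in range(1, n + 1))


def harmonic_term(j):
    """Summand of the harmonic series."""
    return 1.0 / j


def log_term(j):
    """Summand ln(j + 4) / j (assumed reading of Eq. (13.12))."""
    return math.log(j + 4) / j


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, partial_sum(harmonic_term, n), partial_sum(log_term, n))
```

The harmonic partial sums grow roughly like $\ln n$, slowly but without bound, and the larger series is forced upward with them.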
















EXAMPLE 13.37 
Analyze the series 





(13.13) 





= F2. Ss 


Solution: We notice that 


1 1 
fifo /MG FD = G+2)-InG +2) 








for each j. In Example 13.34 above we already noted that the series 





a (j +2)- TF + 2) 


j=l 


diverges. By the comparison theorem for divergence, the series in Eq. (13.13) 
diverges. 














You Try It: Discuss convergence or divergence for the series Sr iG? + j). 
EXAMPLE 13.38
Use the comparison theorem for divergence to study the series

$$\sum_{j=1}^{\infty} \frac{1}{[j^{1/2} + 3]^{1/2}}.$$

Solution: Observe that

$$\frac{1}{[j^{1/2} + 3]^{1/2}} \ge \frac{1}{[4j^{1/2}]^{1/2}}.$$

The series

$$\sum_{j=1}^{\infty} \frac{1}{[4j^{1/2}]^{1/2}} = \frac{1}{2} \sum_{j=1}^{\infty} \frac{1}{j^{1/4}}$$

is a divergent p-series hence the series

$$\sum_{j=1}^{\infty} \frac{1}{[j^{1/2} + 3]^{1/2}}$$

also diverges.


EXAMPLE 13.39 
Does the series 


converge? 














(13.14) 


Solution: This series of nonnegative summands termwise exceeds the series 





But the series 
converges because it is a p-series. Thus the comparison test for divergence does
not tell us anything about the convergence or divergence of the series 





youri Gj +2) 
G +2)? 











j=2 


Insight: Examples 13.38 and 13.39 remind us that the comparison tests are not 
“if and only if” statements. If the comparison test for convergence works, then 
the series in question converges. If the test fails, we know nothing. Likewise, if 
the comparison test for divergence works, then the series in question diverges. 
Otherwise we know nothing. 

Also bear in mind that the comparison tests are valid only for series with 
nonnegative terms. 


13.13 The Ratio Test 


Theorem 13.7 (The Ratio Test) Let

$$\sum_{j=1}^{\infty} c_j$$

be a series. Suppose that

$$\lim_{j \to \infty} \left| \frac{c_{j+1}}{c_j} \right| = L.$$

Then 


1. If L < 1 then the series converges absolutely. 
2. If L > 1 then the series diverges. 
3. If L = 1 then the test gives no information. 


Insight: Take particular notice that the theorem says that when L = 1 then the
test yields no information. This point will be made clearer in the examples. Also
remember that the limit of the expression $|c_{j+1}/c_j|$ might not even exist. In this case
the test also yields no information.

Finally notice that the test only depends on the limit of the ratio $|c_{j+1}/c_j|$. The
outcome does not depend on the first million or so terms of the series, and that is
the way it should be: the convergence or divergence of a series depends only on its
“tail.”


Now we look at some examples that illustrate when the ratio test gives conver- 
gence, when it gives divergence, and when it gives no information. 


EXAMPLE 13.40 
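Apply the ratio test to the series

$$\sum_{j=1}^{\infty} \frac{2^j}{j!}.$$

Solution: Recall that $j! = j \cdot (j - 1) \cdot (j - 2) \cdots 3 \cdot 2 \cdot 1$. Observe that
$c_j = 2^j/j!$. Then

$$\left| \frac{c_{j+1}}{c_j} \right| = \frac{2^{j+1}/(j+1)!}{2^j/j!} = \frac{2}{j+1}.$$

As $j \to \infty$, the limit of this expression is 0. Therefore in this example the limit L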
exists and equals 0. By Part (1) of the ratio test, we may conclude that the series 
converges absolutely. 














EXAMPLE 13.41 
Discuss the convergence or divergence of the series

$$\sum_{j=1}^{\infty} \frac{j^{10}}{2^j}.$$

Solution: In this example we have $c_j = j^{10}/2^j$. Therefore

$$\left| \frac{c_{j+1}}{c_j} \right| = \frac{(j+1)^{10}/2^{j+1}}{j^{10}/2^j} = \left( \frac{j+1}{j} \right)^{10} \cdot \frac{1}{2}.$$

The limit of this last expression exists and equals 1/2. Therefore in this example
$L = 1/2 < 1$. Part (1) of the ratio test then tells us that the series converges
absolutely. 
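
The ratios in these examples can be computed exactly for specific values of $j$. The following is a small sketch, not the author's method: the function names are hypothetical, and the summands $2^j/j!$ and $j^{10}/2^j$ are assumed readings of Examples 13.40 and 13.41, with $1/j$ added as a case where the ratio approaches 1. Exact rational arithmetic avoids overflow and rounding.

```python
from fractions import Fraction
from math import factorial


def ratio(c, j):
    """|c(j+1) / c(j)| computed exactly with rational arithmetic."""
    return abs(c(j + 1) / c(j))


def c_factorial(j):
    """Summand 2^j / j!."""
    return Fraction(2**j, factorial(j))


def c_power(j):
    """Summand j^10 / 2^j."""
    return Fraction(j**10, 2**j)


def c_harmonic(j):
    """Summand 1 / j, for which the ratio test is inconclusive."""
    return Fraction(1, j)


if __name__ == "__main__":
    for j in (10, 100, 1000):
        print(j, float(ratio(c_factorial, j)), float(ratio(c_power, j)), float(ratio(c_harmonic, j)))
```

The first column of ratios shrinks toward 0, the second settles near 1/2, and the third creeps up toward 1, matching the three behaviors the theorem distinguishes.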














You Try It: Discuss convergence or divergence for the series par P+. 


EXAMPLE 13.42 
Analyze the series 





Solution: We have 








_ Gf2)! 
J pp 
Therefore 
gal CDHG + 13? 
Cj (3/2)4/j3?? 








E j 32 3 
TFN 32 


Now, as j — ov, the last expression tends to the limit L = 3/2. Because L > 1, 
the ratio test says that the series diverges. 














EXAMPLE 13.43 
Apply the ratio test to the series

$$\sum_{j=1}^{\infty} \frac{1}{j}.$$

Solution: We have $c_j = 1/j$ hence

$$\left| \frac{c_{j+1}}{c_j} \right| = \frac{1/(j+1)}{1/j} = \frac{j}{j+1}.$$

As $j \to \infty$, the last expression tends to 1. Therefore the ratio test gives no
information whatever. Of course this is the harmonic series, and we know that it
diverges.














EXAMPLE 13.44 
Apply the ratio test to the series 


1 
J’ 


2 


OO 
j=l 
Solution: Observe that c; = 1/j°. Therefore 


Cj+1 
Cj 


VG+ PP 
© MP UGG ED 











as j — oo. Therefore the ratio test gives no information. Of course this is a p-series, 
and we know that it converges. 














You Try It: Discuss convergence or divergence for the series ` j 34/(2/ + j). 


Insight: Examples 13.43 and 13.44 together explain what we mean when we 
say that the ratio test gives no information when the limit L is equal to 1. Under 
these circumstances, the series could diverge (Example 13.43) or the series 
could converge (Example 13.44). There is no way to tell: you must use another 
convergence test to find out. 


13.14 The Root Test


The root test has a similar flavor to the ratio test. Sometimes it is easier to apply the 
ratio test than it is to apply the root test or vice versa. Thus, even though the tests 
look rather similar, you should be well-versed at using both of them. 


Theorem 13.8 (The root test) Let

$$\sum_{j=1}^{\infty} c_j$$

be a series. Suppose that

$$\lim_{j \to \infty} |c_j|^{1/j} = L.$$

Then 


(1) If L < 1 then the series converges absolutely. 
(2) If L > 1 then the series diverges. 
(3) If L = 1 then the test gives no information. 


Insight: Once again take particular note that when L = 1 then the root test gives 
no information whatever. Also, the limit L might not even exist. In this case, the 
test also gives no information. Finally, the root test does not depend on the first 
million or so terms of the series—only on the “tail.” 


Now we look at some examples that will illustrate when the root test gives 
convergence, when it gives divergence, and when it gives no information. 


EXAMPLE 13.45 
Analyze the series

$$\sum_{j=1}^{\infty} \left( \frac{j}{j^2 + 6} \right)^j.$$

Solution: We apply the root test. Notice that $c_j = [j/(j^2 + 6)]^j$ so that

$$|c_j|^{1/j} = \frac{j}{j^2 + 6}. \tag{13.15}$$

Certainly for any $j \ge 1$ we have that

$$\frac{j}{j^2 + 6} \le \frac{j}{j^2} = \frac{1}{j}.$$

It follows that the expression (13.15) tends to 0 as $j \to \infty$. Therefore $L = 0 < 1$
and the root test tells us that the series converges.














Insight: Notice that 
provided that j > 2. Thus the series from Example 13.45 is termwise dominated
by the series 


1 
43 


2 


j=l J 


Thus convergence can also be obtained from the comparison test. The root test, 
however, proves to be more straightforward to apply in this example. 


EXAMPLE 13.46 
Apply the root test to the series

$$\sum_{j=1}^{\infty} \frac{1}{[\ln(j + 2)]^j}.$$

Solution: Observe that $c_j = 1/[\ln(j + 2)]^j$. Thus

$$|c_j|^{1/j} = \frac{1}{\ln(j + 2)}.$$

This expression tends to 0 as $j \to \infty$. Therefore $L = 0 < 1$ and the root test says
that the series converges absolutely. (Try using the ratio test on this one! How about
the comparison test?)














You Try It: Discuss convergence or divergence for the series } > j 2//(4/ + 3/). 


EXAMPLE 13.47 
Apply the root test to the series

$$\sum_{j=1}^{\infty} \frac{2^j}{j^{10}}.$$

Solution: We have $c_j = 2^j/j^{10}$. Therefore

$$|c_j|^{1/j} = \frac{2}{(j^{1/j})^{10}}. \tag{13.16}$$

Since $j^{1/j} \to 1$ as $j \to \infty$, the expression (13.16) tends to $2 = L > 1$. We
conclude, using the root test, that the series diverges.
EXAMPLE 13.48
Test the series

$$\sum_{j=1}^{\infty} \frac{(7j^2 + 1)^j}{(2j + 2)^{2j}}$$

for convergence or divergence.
Solution: We see that

$$c_j = \frac{(7j^2 + 1)^j}{(2j + 2)^{2j}}.$$

As a result,

$$|c_j|^{1/j} = \frac{7j^2 + 1}{(2j + 2)^2} = \frac{7j^2 + 1}{4j^2 + 8j + 4} \to \frac{7}{4} > 1.$$













By the root test, the series diverges. 





EXAMPLE 13.49 
Analyze the series 


using the root test. 
Solution: We have c; = j’. Thus 


elit =g 
This last expression tends to $L = 1$ as $j \to \infty$. Therefore the root test gives no
conclusion. On the other hand, the terms of the series do not tend to zero hence the 


series fails the zero test. Thus the series diverges. 
EXAMPLE 13.50
Does the series 


converge? 
Solution: We see that $c_j = 1/[j \cdot (\ln j)^2]$. Therefore

$$|c_j|^{1/j} = \frac{1}{j^{1/j} \cdot [(\ln j)^{1/j}]^2}.$$

As $j \to \infty$, we know that

$$j^{1/j} \to 1.$$

Also

$$1 \le (\ln j)^{1/j} \le j^{1/j}$$

for $j \ge 3$ hence we may conclude that our sequence is trapped between two
sequences that we have already studied. In particular, $(\ln j)^{1/j} \to 1$.
In conclusion,

$$|c_j|^{1/j} \to 1 = L.$$


Therefore the root test gives absolutely no information. Check for yourself that the 
Cauchy condensation test gives convergence. 














Insight: Examples 13.49 and 13.50 explain what we mean when we say that the 
root test gives no information when L = 1. Under these circumstances the series 
could diverge (Example 13.49) or the series could converge (Example 13.50). The 
only way to find out is to use another test. 
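
The $j$th roots in these examples can be explored numerically if one works with logarithms, since $|c_j|^{1/j} = \exp(\ln|c_j| / j)$ and the terms themselves overflow or underflow quickly. The sketch below is illustrative only: the function names are hypothetical, and the summands are assumed readings of Examples 13.46, 13.47, and 13.50.

```python
import math


def root_from_log(log_abs_c, j):
    """Return |c_j|^(1/j), given a function that computes ln|c_j|."""
    return math.exp(log_abs_c(j) / j)


def log_c_fast(j):
    """ln of 1 / (ln(j + 2))^j; the roots tend to 0."""
    return -j * math.log(math.log(j + 2))


def log_c_power(j):
    """ln of 2^j / j^10; the roots tend to 2."""
    return j * math.log(2) - 10 * math.log(j)


def log_c_log(j):
    """ln of 1 / (j * (ln j)^2), defined for j >= 2; the roots tend to 1."""
    return -math.log(j) - 2 * math.log(math.log(j))


if __name__ == "__main__":
    for j in (10, 1000, 10**6):
        print(j, root_from_log(log_c_fast, j), root_from_log(log_c_power, j), root_from_log(log_c_log, j))
```

The third column drifts toward 1, which is exactly the case where the root test is silent and another test is needed.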


Exercises 


In Exercises 1 and 2, write out the first ten partial sums of the series. 


1. Deo 3 
2 Baie 
In each of Exercises 3 and 4, use a calculator to determine some partial sums and
guess whether the series converges. If you can (for the cases of convergence), state 
what the sum should be. 


3. EE 
Ae IG aaa) 


Each of the series in Exercises 5 and 6 converges. Explain why. Can you find the 
sum? 


5. Line BGS 
6. D 2-/ cos( jr) 
In Exercises 7 and 8, find the explicit sum of the series. 
LS 
8. Gi)! 


In Exercises 9 and 10, calculate each of the sums by using the formula for the partial 
sum of a geometric series which appears in the text. 


onga 
S a 


In Exercises 11 and 12, use the comparison tests to determine whether the series 
converges or diverges. 


Pe 
œo sin’ j 
1. 12) $ 


12. 2% i PH 


In Exercise 13 , test the given series for convergence or divergence by using the 
Ratio Test. If the test gives no information, then say so explicitly. 


13) et 


In Exercise 14, test the given series for convergence or divergence by using the 
Root Test. If the test gives no information, then say so explicitly. 


3j j 
14. DE, (24) 
Final Exam


1. What is the contrapositive of the statement “If books are good then jobs are 


scarce”? 


(a) If jobs are scarce then books are good. 
(b) If jobs are not scarce then books are bad. 
(c) If jobs are bad then books are scarce. 

(d) If jobs are good then books are bad. 


(e) If jobs are books then scarce is bad. 


2. Give an example of a statement involving the variable x that is true for all
values of x (in some universe that you will specify). 


(a) If x is a real number then $x^2 + 1 > 0$.
(b) If x is a real number then 3 < x < 5.
(c) If x is a rational number then $x^2 > 0$.
(d) If x is a real number then x is a rational number.


(e) If x is an integer then x > 0. 
3. Give an example of a statement involving the variable x that is false for all


values of x (in some universe that you will specify). 


(a) If x is a real number then x* > 0. 
(b) If x is a real number then x + 3 = 7.
(c) If x is an integer then x? = 2. 

(d) If x is a natural number then x? < 0. 


(e) If x is a complex number then x? = x. 


4. What is the negation of the statement “All fish have fins”?


(a) No fish have fins. 

(b) A few fish have fins. 

(c) Fins have fish. 

(d) There is a fish with no fins. 
(e) Lots of fish have no fins. 


5. What is the negation of the statement “Either the woman is blonde or the


man is short”? 


(a) The woman is blonde and the man is short. 
(b) The woman is not blonde and the man is tall. 
(c) The woman is tall and the man is blonde. 

(d) Everyone is blonde. 

(e) Nobody is short. 


6. Give a statement that is logically equivalent to $\sim B \Rightarrow \sim A$ and which uses
only $\vee$ and $\sim$.

(a) $B \vee \sim A$
(b) $\sim B \vee \sim A$
(c) $B \vee A$
(d) $B \vee \sim B$
(e) $A \vee \sim A$


7. Give a statement that is logically equivalent to $\sim A \Rightarrow \sim B$ and which uses
only $\wedge$ and $\sim$.

(a) $B \wedge \sim A$
(b) $\sim B \wedge \sim A$
(c) $\sim (\sim A \wedge B)$
(d) $B \wedge \sim B$
(e) $A \wedge \sim A$


8. The denial of the statement “All red boats sail on blue water” is


(a) Some red boats sail on blue water. 

(b) There is some red boat that sails on water that is not blue. 
(c) All red boats do not sail. 

(d) Some red boats do sail. 

(e) This red boat is for sale. 


9. The method of proof by contradiction is useful for


(a) Proving that something is false. 

(b) Proving that something is undecidable. 
(c) Proving that something is true. 

(d) Proving that something cannot be proved. 


(e) Disproving something. 
10. The method of mathematical induction involves


(a) Infinitely many steps. 

(b) A contradictory step. 

(c) A bootstrap step. 

(d) An important inductive step. 


(e) An indeterminate step. 


11. The method of direct proof is important because
(a) It is indirect. 

(b) It is valid. 

(c) It is invalid. 

(d) It applies to most proofs. 

(e) It is logically simple. 

12. If A = {1, 2, 3} and B = {2, 3, 4} then

(a) $A \subset B$

(b) $B \subset A$

(c) $A \cap B = \emptyset$
(d) $A \cap B$ has two elements
(e) $A \cup B$ has three elements
13. If $A \subset B$ and $B \subset C$ then


(a) $A \supset C$
(b) $A \supset B$
(c) $A \cap B = C$
(d) $A \cap C = B$
(e) $A \subset C$
14. If A has 2 elements and B has 3 elements and C has 4 elements, then 
A x B x C has 


(a) 4 elements
(b) 9 elements 
(c) 15 elements 
(d) 24 elements 
(e) 6 elements 
15. If $x \in A$ and $A \in B$ then
(a) x may not be an element of B
(b) $x \in B$
(c) $x \cap B = \emptyset$
(d) $x \cup B = A$
(e) $x \cap A = B$
16. The set ${}^c(A \cap B)$ is the same as
(a) $A \cap B$
(b) $A \cup B$
(c) ${}^cA \cup {}^cB$
(d) $A \subset B$
(e) $B \subset A$


17. If A has 5 elements and B has 4 elements then 


(a) $A \cap B$ has 3 elements.

(b) $A \cup B$ has 6 elements.

(c) $A \times B$ has 10 elements.

(d) $A \cap B$ has at most 4 elements.
(e) $A \setminus B$ has at most 2 elements.
18. Let S be the collection of all sets with at most 5 elements. Then


(a) An element of S is a set with 1, 2, 3, 4, or 5 elements.

(b) An element of S is a number.

(c) An element of S is a set with 25 elements. 

(d) An element of S is a set with an arbitrary number of elements. 

(e) An element of S is a superset of S. 

19. If f is a function with domain S and values in R and g is a function with

domain S and values in R then 

(a) f − g is a function with domain S and values in C.

(b) f - g is undefined. 

(c) All the arithmetic operations make sense on f and g (provided we do 
not divide by 0). 

(d) There are no functions between f and g. 


(e) f is smaller than g. 


20. Say that two real numbers a and b are related if the sum of their squares is
100. Then this relation is 

(a) A function. 

(b) One with finitely many elements. 

(c) An equivalence relation. 

(d) Symmetric in its entries. 

(e) Increasing. 

21. Say that two functions f and g, with domain R, are related if $f(x) \le g(x)$
for every $x \in R$. Then

(a) This is an equivalence relation. 

(b) This is an order relation. 

(c) This is a function. 

(d) This is a total ordering. 

(e) This does not allow us to compare any two functions. 

22. If f is a natural-number-valued function with domain Q and g is a
natural-number-valued function with domain Q then $h(x) = f(x)^{g(x)}$ is

(a) A function with domain Q and image C. 

(b) A function with values in N. 


(c) A function with upper and lower bounds. 
(d) A function with limited values.

(e) A function with bounded socles.

23. A function f(x) with domain and range R is said to be the square root of the
function g with domain and range R if $f^2(x) = g(x)$ for every x. If f and g
have this relationship then 

(a) The function f is greater than the function g. 

(b) The function g takes nonnegative values. 

(c) The function f is increasing. 

(d) The function g is differentiable. 

(e) The function f is exceptional. 

24. If $f(x) = x^2$ and $g(x) = \sin x$ then

(a) f o g is unbounded. 

(b) go f is unbounded. 

(c) f og and go f are both well defined. 

(d) f - g is increasing. 

(e) f — g is monotone. 

25. Say that two universities are related if they sponsor NCAA sports and if they
play against each other regularly. Then 

(a) This is a symmetric relation. 

(b) This is a doomed relation. 

(c) This is a function. 

(d) This is an equivalence relation. 


(e) This is a perfect relation. 
26. The union of two functions is


(a) Always an equivalence relation. 
(b) Never an equivalence relation. 
(c) Another function. 

(d) A one-to-one function. 


(e) An onto function. 
27. Between every two distinct rational numbers there is


(a) A whole number or integer. 
(b) A multiple of five. 
(c) An irrational number.
(d) A quintessential number. 


(e) A complex number. 
28. Addition of integers is


(a) Commutative and associative. 
(b) Distributive. 
(c) Disruptive. 
(d) Meaningful. 


(e) Anticommutative. 
29. The familiar number systems which form a field are


(a) The integers and the natural numbers. 

(b) Only the rational numbers. 

(c) Only the real numbers. 

(d) The rational numbers, the real numbers, and the complex numbers. 


(e) Only the integers. 
30. Every nonzero complex number has


(a) A square root. 

(b) A cube root. 

(c) Three square roots. 

(d) No real roots. 

(e) Two distinct square roots. 
31. Both square roots of a nonzero, positive real number will be
(a) Real. 

(b) Positive. 

(c) Negative. 

(d) Complex. 

(e) Incommensurable. 


32. Unlike the rational numbers, the real number system is


(a) Unbounded. 
(b) Complete. 
(c) Monotone. 
(d) Decreasing.


(e) Decentralized. 
33. The process of mathematical induction is closely related to


(a) The rational number system. 
(b) The integers. 

(c) The natural numbers. 

(d) The real numbers. 


(e) The quaternions. 
34. The complex numbers improve or augment the reals in that


(a) They are more complex. 

(b) Every polynomial has a root. 

(c) They are two-dimensional rather than one-dimensional. 
(d) They have Argand diagrams. 


(e) They are commutative under addition. 
35. The quaternions do not form a field because


(a) They are four-dimensional. 

(b) They have i, j, and k as elements. 

(c) They are commutative under addition. 

(d) They are not commutative under multiplication. 


(e) They are more complex than the complex numbers. 
36. The modulus of a complex number measures


(a) Its distance from the origin. 

(b) Its distance from the real axis. 

(c) Its distance from the imaginary axis. 
(d) Its size and shape. 


(e) Its magnitude. 
37. The triangle inequality is


(a) A special property of the real number system. 
(b) A special property of the complex number system. 


(c) A special property of the rational number system. 
(d) A useful device for measuring distance in a number system.


(e) A fact about triangles. 
38. The rational number system is


(a) Closed under limits. 

(b) A very small number system. 

(c) Not closed under square roots. 

(d) Almost as big as the real number system. 


(e) Much smaller than the complex number system. 
39. The existence of negative integers is


(a) An artifice that we concoct to make life interesting. 
(b) A property that we hope is true. 

(c) A special feature of that number system. 

(d) A byproduct of the way that we construct the integers. 
(e) Very confusing. 

40. In everyday commerce and in most practical activities the number systems
that are most commonly used are 

(a) The quaternions. 

(b) The integers and the rational numbers. 

(c) The natural numbers. 

(d) The complex numbers. 


(e) The algebraic integers. 
41. The number of different ways to select 3 cards from a pack of 10 is


(a) 120 
(b) 100 
(c) 50 
(d) 96 
(e) 84 
42. The number of distinct permutations of 6 objects is
(a) 500 
(b) 720 
(c) 650 
(d) 401
(e) 405
43. Fifty black balls and 50 white balls are dropped at random into 60 cups. We


can be sure that 

(a) Some cup contains both a black ball and a white ball. 
(b) Some cup contains 2 black balls. 

(c) Some cup contains 2 white balls. 

(d) Some cup contains 2 balls. 

(e) Some cup contains 3 balls. 

44. The expression $[x + a]^4$ can be expanded to

(a) $x^2 + 2ax + a^2$

(b) $x^3 + 3a^2x + 3ax^2 + a^3$

(c) $x^4 + 4x^3a + 6x^2a^2 + 4xa^3 + a^4$

(d) $x^4 + a^4$

(e) $x^4 - a^4$

45. The recursion relation $a_1 = 1$, $a_2 = 2$, $a_j = 2a_{j-1} - a_{j-2}$ has the solution
(a) aj = j? +4 

(b) aj = 37 —5 

© aj=j}’ -j 

(d) aj= j 

(e)aj=1 


46. The probability of getting a straight (5 cards in sequence) from an ordinary
52-card deck of playing cards is 


(a) 0.000189 
(b) 0.00111 
(c) 0.0123 
(d) 0.5432 
(e) 0.0000765 


47. The probability of picking 2 cards at random from a standard 52-card deck
and having them both be of the same denomination (in other words, the 
probability of having a pair) is 

(a) 0.0588 

(b) 0.0033 
(c) 0.00004
(d) 0.0294 
(e) 0.1111 


48. You have a deck of five distinct cards, numbered 1, 2, 3, 4, and 5. You close
your eyes and pick them one by one, in some random order. What is the
probability that you chose them in the order 1-2-3-4-5?


(a) 0.1234 
(b) 0.0044 

(c) 0.00833 
(d) 0.001101 
(e) 0.11223 


49. The sum of the matrices


(d) 


11 

1 a 
25! 8 

©) be =) 


50. The matrix product 
equals


(a) k 2) 
TE 
o( 3) 
(a) @ 0) 
0 (4 2) 
51. The inverse of the matrix i i) is 
(a) G i3 
(b) G 3 
©) G 2) 
(d) & à 
() a 0) 


52. The solution of the system 


x—2y+z=4 
2x+y-z=1 
—x+y+z=3 


is 
(a) x = 2, y = 4, z = 3
(b) x = 2, y = 1, z = 4
(c) x = 5, y = 3, z = 1

(d) x = 1, y = 1, z = 0

(e) x = −3, y = −1, z = 1

53. When we solve four equations in three unknowns we generically expect


(a) No solutions. 

(b) One solution. 

(c) Three solutions. 

(d) Infinitely many solutions. 

(e) One solution for each variable. 

54. Find the extrema of the linear function f(x, y) = 3x − 4y on the planar
region $\{(x, y) : 4 \le x + y \le 7,\ 1 \le x \le 3\}$.

(a) The minimum value is —8 and the maximum value is 5. 
(b) The minimum value is —2 and the maximum value is 6. 
(c) The minimum value is —21 and the maximum value is 5. 
(d) The minimum value is —2 and the maximum value is 7. 


(e) The minimum value is —1 and the maximum value is 1. 
55. The reason that the kth row of Pascal’s triangle sums to $2^k$ is that


(a) The number $2^k$ is even.

(b) There are $2^k$ rows in Pascal’s triangle.
(c) Powers of 2 are powerful.

(d) The entries are the binomial coefficients. 


(e) There is hidden symmetry. 
56. A graph without an Euler path is


(a) Very simple. 

(b) Very complicated. 
(c) Very reducible. 
(d) Redundant. 

(e) Superfluous. 


57. A complete graph on 6 vertices has


(a) 12 edges. 
(b) 10 edges. 
(c) 15 edges.
(d) 8 edges. 
(e) 4 edges. 
58. The number of different graphs on 3 vertices is 
(a) 4 
(b) 10 
(c) Infinite 
(d) Few 
(e) Many 
59. A graph with four vertices and two edges 


(a) Might be connected or might not. 
(b) Must be connected. 

(c) Cannot have any Euler paths. 

(d) Must have Euler paths. 

(e) Must be reducible. 





60. The Euler characteristic of a sphere with g handles is 
(a) 3g 
(b) 2 - 2g
(c) 2 + 2g
(d) g 
(e) 4+g 
61. A traveling salesman must visit three cities, each just once. There are paths
connecting every city to every other city. He/she will begin at a particular
city A. How many different routes could he/she take?


(a) 5 
(b) 3 
(c) 2 
(d) 4 
(e) 1 
62. Answer the question in Prob. 61 for four cities.
(a) 2
(b) 4
(c) 3
(d) 5
(e) 1
63. Simplify the expression 3 + 4 mod 5.
(a) 1
(b) 2
(c) 3
(d) 4
(e) 5
64. The expression 8/5 mod 11 simplifies to
(a) 6
(b) 5
(c) 4
(d) 3
(e) 2
65. Simplify the expression 6 · 7 mod 9.
(a) 1
(b) 2
(c) 4
(d) 5
(e) 6
66. The identity element in a group is
(a) One of a finite set.
(b) Part of the reproducing set.
(c) Unique.
(d) Self-reproducing.
(e) Permanent.
67. If g is an element of the group G then its multiplicative inverse element is


(a) Unique. 
(b) Self-replicating. 


(c) Permanent. 
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(d) Multiplicative. 

(e) Subjunctive. 

68. The group Z/4Z has how many subgroups?

(a) 5 

(b) 3 

(c) 4 

(d) 1 

(e) 2 

69. How many Sylow subgroups does Z/4Z have?

(a) 1 

(b) 3 

(c) 4 

(d) 2 

(e) 5 

70. Can Z/3Z be realized as a subgroup of Z/9Z (that is, is there a subgroup that
is equivalent or isomorphic to it)?

(a) Yes. 

(b) No. 

(c) Sometimes. 

(d) Usually. 

(e) It is forbidden. 

71. Let G be a group and g, h ∈ G. Then (g · h)⁻¹ equals
(a) h’g 

(b) h'g! 

(c) h'g? 

(d) AIh? 

(e) gh! 

72. Let G be a group and H a subgroup. Define G/H to be the set of equivalence
classes of G under the relation g ~ h if g⁻¹h ∈ H. Describe G/H in case
G = Z and H = 3Z.

(a) Z/2Z
(b) Z/3Z
(c) Z/4Z
(d) R
(e) Q
73. Call H a normal subgroup of the group G if g⁻¹hg ∈ H whenever g ∈ G
and h ∈ H. Is the set of even integers a normal subgroup of Z?
(a) No.
(b) Not usually.
(c) Sometimes.
(d) Yes.
(e) Disallowed.
74. Refer to Prob. 73 for the definition of normal subgroup. Is the set of
diagonal matrices a subgroup of the set of all 2 × 2 invertible matrices under
multiplication?
(a) Sometimes.
(b) Most times.
(c) Seldom times.
(d) No.
(e) Yes.
75. The encryption of the message COUNT TO FOUR under the affine
encryption P ↦ 4P - 2 is
(a) HPQRSUTMNOM
(b) GEAAYYESEAQ
(c) ORMKBOESPKS
(d) VFUODBNEUKD
(e) XBGKDUNGEUM
76. The encryption of the message USE YOUR HEAD under the encryption
P ↦ P² + P is

(a) ZMFNETHELSEL 

(b) FLEJBIPALBME 

(c) VDEUBGOIKSUE 

(d) DDUECEUEUAUQ 

(e) FELDJBOIWJBE 
77. The encrypted message KFLMUDU can be decrypted with the mapping
P ↦ 3P + 5. The message reads

(a) JUMP NOW 

(b) GET HOME 

(c) STAY PUT 

(d) BACK OFF 

(e) GET GOOD 


78. The digraph method breaks a message up into


(a) Units of 3 characters. 
(b) Units of 10 characters. 
(c) Units of 2 characters. 
(d) Units of 4 characters. 


(e) Units of 7 characters. 
79. When using the digraph method, we do arithmetic modulo


(a) 676 
(b) 625 
(c) 601 
(d) 699 
(e) 666 


80. Encryption and decryption are


(a) Complementary processes. 
(b) Evil processes. 

(c) Secret processes. 

(d) Inverse processes. 


(e) Subtle processes. 
81. Cryptography is an old idea, going back even to


(a) Hannibal 

(b) Attila the Hun 

(c) William the Conqueror 
(d) Julius Caesar 

(e) Abraham Lincoln 




83. 


84. 


85. 


86. 




82. The boolean expression [a x b] + [a x b] + [a x b] simplifies to


(a) axb 
(b) axb 
(c) a+b
(d) a+b 
(e) a x b 
Boolean algebra is useful in 
(a) Square dancing. 
(b) Circuit design. 
(c) Mosaic tiling. 
(d) Fly fishing. 
(e) Acrobatics. 
How many axioms does boolean algebra have? 
(a) Two 
(b) Three 
(c) Five 
(d) Seven 
(e) Nine 
;2 
JEF 


(a) Converges. 





The sequence 


(b) Diverges. 

(c) Oscillates. 

(d) Fiddles around. 
(e) Dies. 

The sequence j - sin j 
(a) Converges. 

(b) Diverges. 

(c) Perpetrates. 

(d) Disintegrates. 

(e) Propagates. 
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89. 


90. 
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. The sequence j 


(a) Subverges. 
(b) Converges. 
(c) Diverges to ∞.
(d) Disverges. 


(e) Postverges. 
The sum of two sequences is 


(a) A series. 

(b) A product. 

(c) A panoply. 

(d) Another sequence. 


(e) A group. 


If {a_j} is a sequence of nonvanishing terms that converges to some nonzero
number ℓ, then the sequence 1/a_j
(a) Converges to 1/ℓ.
(b) Diverges.
(c) Disbands.
(d) Converges to ℓ.
(e) Converges to 3ℓ.


The sequence a_j = (-2)^j


(a) Converges. 
(b) Diverges. 
(c) Mutates. 
(d) Obfuscates. 
(e) Rotates. 

j? 
The sequence aI 
(a) Increases. 
(b) Diverges. 


(c) Converges. 
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92. 


93. 


94. 


95. 


96. 


(d) Displays. 
(e) Radiates. 


The pinching theorem is a device for 


(a) Proving the convergence of a sequence. 
(b) Proving the divergence of a sequence. 
(c) Recognizing a sequence. 

(d) Discarding a sequence. 

(e) Hiding a sequence. 


j? 
The series — 


—~ 2j 
j 
(a) Diverges. 
(b) Disbands. 
(c) Refutes. 
(d) Converges. 
(e) Dissipates. 


1 J 
The series ) (5) 
: J 
j 


(a) Converges. 
(b) Diverges. 
(c) Subverts. 
(d) Reverts. 
(e) Exerts. 
The series = j-sinj 
j 
(a) Converges. 
(b) Diverges. 
(c) Implodes. 
(d) Divests. 


(e) Soars. 


1 J 
The series ye (1 + -) 


; J 
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(a) Redounds. 
(b) Exerts. 

(c) Restarts. 
(d) Converges. 
(e) Diverges. 


1 J 
97. The series ss (=) 
; J 
j 


(a) Projects. 
(b) Converges. 
(c) Rejects. 
(d) Diverges. 
(e) Subjects. 
98. The sum of two series 


(a) Is one spicy meatball. 
(b) Is opinionated. 
(c) Is another series. 
(d) Is eternal. 
(e) Is ephemeral. 
99. The purpose of series is to provide 

(a) A generalization of ordinary addition. 
(b) A good use of time. 
(c) A good use of money. 
(d) A diversion. 
(e) An engagement. 

100. TRUE OR FALSE: If a_j > 0 and Σ_j a_j converges then Σ_j a_j² converges.
(a) True 
(b) False 
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Solutions to Exercises 


This book has a great many exercises. For some we provide sketches of solutions 
and for others we provide just the answers. For some, where there is repetition, we 
provide no answer. For the sake of mastery, we encourage the reader to write out 
complete solutions to all problems. 


Chapter 1 


1. (a)
S | T | ~(S ∨ T) | (S ∧ T) ∨ ~(S ∨ T)
T | T |    F     |    T
T | F |    F     |    F
F | T |    F     |    F
F | F |    T     |    T

(b)
S | T | S ∨ T | S ∧ T | (S ∨ T) ⇒ (S ∧ T)
T | T |   T   |   T   |   T
T | F |   T   |   F   |   F
F | T |   T   |   F   |   F
F | F |   F   |   F   |   T
2. (a) S ⇒ ~T
(b) U ⇒ (V ∨ ~S)
3. (a) If either all politicians are honest or no men are fools then I do not have 
two brain cells to rub together. 


(b) Either the pie is in the sky or both some men are fools and I do have two 
brain cells to rub together. 


4. (a) Converse: If there are clouds, then it will rain. 
Contrapositive: If there are no clouds, then it will not rain. 


(b) Converse: If it is raining, then there are clouds. 
Contrapositive: If it is not raining, then there are no clouds. 
5. (a) False. The area inside a circle is πr².
(b) True. We note that 2 + 2 = 4 is true and 2/5 is also a rational number. 
6. (a) ~S ∨ ~T
(b) ~(S ∨ T)
7. (a) The set S contains at most one integer.


(b) Either some mare does not eat oats or some doe does not eat oats. 





8. (a)
A | B | ~A | ~B | A ∨ ~B | ~A ⇒ B
T | T | F  | F  |   T    |   T
T | F | F  | T  |   T    |   T
F | T | T  | F  |   F    |   T
F | F | T  | T  |   T    |   F

(b)
A | B | A ∧ ~B | ~A ⇒ ~B
T | T |   F    |   T
T | F |   T    |   T
F | T |   F    |   F
F | F |   F    |   T

The two statements are logically inequivalent.
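
A truth table of this kind can also be generated mechanically. The following is a minimal Python sketch, assuming the two statements being compared are A ∧ ~B and ~A ⇒ ~B (as read from the column headers); the helper names `implies` and `truth_table` are illustrative only.

```python
from itertools import product

def implies(p, q):
    """Material implication: p => q is false only when p is true and q is false."""
    return (not p) or q

def truth_table(formula):
    """Evaluate a two-variable formula on the rows (T,T), (T,F), (F,T), (F,F)."""
    return [formula(a, b) for a, b in product([True, False], repeat=2)]

first = truth_table(lambda a, b: a and not b)             # A and not B
second = truth_table(lambda a, b: implies(not a, not b))  # not A => not B

for (a, b), x, y in zip(product([True, False], repeat=2), first, second):
    print(a, b, x, y)

# Two statements are logically equivalent exactly when their columns agree.
print("equivalent:", first == second)
```

The columns differ in the first and last rows, which is what inequivalence means.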
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Chapter 2 


1. If m = 2r + 1 and n = 2s + 1 then m · n = (2r + 1) · (2s + 1) = 4rs +
2r + 2s + 1 = 2(2rs + r + s) + 1, which is odd.


2. If n = 2r then m · n = m · (2r) = 2(m · r), which is even.
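3. By induction:
The case n = 1 is true because

$$1^2 = 1 = \frac{2 \cdot 1^3 + 3 \cdot 1^2 + 1}{6}$$

Now assume that the case n has been established, so

$$1^2 + 2^2 + \cdots + n^2 = \frac{2n^3 + 3n^2 + n}{6}$$

Add (n + 1)² to both sides to obtain

$$1^2 + 2^2 + \cdots + n^2 + (n+1)^2 = \frac{2n^3 + 3n^2 + n}{6} + (n+1)^2$$

We may write the right-hand side as

$$\frac{2n^3 + 3n^2 + n}{6} + \frac{6n^2 + 12n + 6}{6} = \frac{2(n+1)^3 + 3(n+1)^2 + (n+1)}{6}$$

That completes the inductive step.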
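
An induction proof can be sanity-checked numerically. The sketch below, in Python, assumes the identity being proved is 1² + 2² + ... + n² = (2n³ + 3n² + n)/6 and compares both sides for small n; the function names are illustrative.

```python
def sum_of_squares(n):
    """Left-hand side: add the squares 1^2 + 2^2 + ... + n^2 directly."""
    return sum(k * k for k in range(1, n + 1))

def closed_form(n):
    """Right-hand side: (2n^3 + 3n^2 + n) / 6, which is always an integer."""
    return (2 * n**3 + 3 * n**2 + n) // 6

for n in range(1, 6):
    print(n, sum_of_squares(n), closed_form(n))
```

A finite check like this is not a proof, but a mismatch would expose an error in the formula at once.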
4. If m = 3^k and n = 3^ℓ and k ≤ ℓ then


m + n = 3^k + 3^ℓ = 3^k (1 + 3^(ℓ-k))


Clearly 1 + 3^(ℓ-k) is not a power of 3.


5. Say that n = (a/b)², where a and b have no common divisors. Then
nb² = a²


Now each prime divisor of a must divide the left-hand side. But it cannot
divide b, so it must divide n. And in fact it must do so twice (since a is
squared on the right). So n = a²r for some integer r. Thus
(a²r)b² = a²
Dividing out a² gives
rb² = 1


We conclude that r = 1 and b = 1 (since all numbers here are integers). But 
then a/b is a whole number, not a fraction. 


6. If n + 1 = b² and n = a² then


1 = (n + 1) - n = b² - a² = (b - a)(b + a)
Since all numbers are integers we must conclude that
b + a = 1
b - a = 1


Thus a = 0 and b = 1. That is a contradiction because 0 is not a natural
number.


7. The inductive step is not well defined. It is not possible to write down an
inductive statement that is valid for the entire argument.


8. The case k = 3 is trivial since 2³ > 1 + 2 · 3. Assuming that we have
established the case k, we have
2^k > 1 + 2k
Multiplying both sides by 2 we find that
2^(k+1) > 2 + 4k
= 2(k + 1) + 2k
> 2(k + 1) + 1


That is the inductive step.


9. The pigeonhole principle is clear for 1 mailbox and 2 letters.


Suppose it has been established for k mailboxes. We now have k + 1 mail- 
boxes and k + 2 letters. If all the letters are placed in just k mailboxes then 
the result is clear by the inductive step. So suppose instead that every box 
contains at least one letter. If the first k boxes each have precisely one letter, 
then the last box has two and we are done. If instead one of the first k boxes 
has two letters then we are done. That completes the inductive step. 
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10. If one letter is in the wrong envelope then two letters are in the wrong 
envelope. So the probability is 0. 


Chapter 3 


1. (a) S ∩ U = {1, 2, 3, 4}
(b) (S ∩ T) ∪ U = {1, 2, 3, 4, 5, 9}
2, SxT=@6 


3. (a) If x ∈ S ∩ (T ∪ U) then x ∈ S and x ∈ T ∪ U. Thus x ∈ S and either
x ∈ T or x ∈ U. So x ∈ S ∩ T or x ∈ S ∩ U. Thus x ∈ (S ∩ T) ∪ (S ∩ U).
So S ∩ (T ∪ U) ⊂ (S ∩ T) ∪ (S ∩ U).

If now x ∈ (S ∩ T) ∪ (S ∩ U) then x ∈ S ∩ T or x ∈ S ∩ U. So either
x is in both S and T or x is in both S and U. Thus x ∈ S and either
x ∈ T or x ∈ U. In conclusion, x ∈ S and x ∈ T ∪ U. We see then that
x ∈ S ∩ (T ∪ U). So (S ∩ T) ∪ (S ∩ U) ⊂ S ∩ (T ∪ U).
(b) Similar.


Figure 3.1s S ∩ (T ∪ U) = (S ∩ T) ∪ (S ∩ U).
4. (a)


Figure 3.2s S ∪ (T ∩ U) = (S ∪ T) ∩ (S ∪ U).
(b)
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We see that A\B = Ø, A\C = Ø, and AUB =B. 


We see that Q\ Z equals all rational numbers, expressed in lowest terms, with 
denominator unequal to 1. 


. The set Q x R is the set of ordered pairs such that the first entry is a rational 


number and the second entry is a real number. The set Q x Z is the set of 
ordered pairs such that the first entry is a rational number and the second 
entry is an integer. 


The set (Q x R)\(Z x Q) is the set of ordered pairs such that the first entry 
is a rational number and the second entry is a real number, and further so that 
not both the first entry is an integer and the second entry is rational. 


The power set is 


{∅, {a}, {b}, {1}, {2}, {a, b}, {a, 1}, {a, 2}, {b, 1}, {b, 2}, {1, 2},
{a, b, 1}, {a, b, 2}, {a, 1, 2}, {b, 1, 2}, {a, b, 1, 2}}
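
Listing a power set by hand is error-prone, so it can help to generate it. This is a minimal Python sketch, assuming the underlying set is {a, b, 1, 2}; `power_set` is an illustrative helper built on the standard `itertools` module.

```python
from itertools import chain, combinations

def power_set(elements):
    """Return every subset of `elements`, grouped by increasing size."""
    items = list(elements)
    all_sizes = (combinations(items, r) for r in range(len(items) + 1))
    return [set(subset) for subset in chain.from_iterable(all_sizes)]

subsets = power_set(["a", "b", 1, 2])
for s in subsets:
    print(s)

# A set with n elements has 2**n subsets.
print(len(subsets))
```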


10. (a) False 


(b) True 
(c) False 


11. (a) The power set is 


{∅, {1}, {∅}, {{a, b}}, {1, ∅}, {1, {a, b}}, {∅, {a, b}}, {1, ∅, {a, b}}}


(b) The power set is 


{9, {e}, {A}, {3}, fo, A}, fo, 3}, {A, 3}, fe, A, a}} 


Chapter 4 


1. Reflexive: If n ∈ Z, then n + n = 2n is even so (n, n) ∈ R.


Symmetric: If (m, n) ∈ R, then m + n is even so n + m is even hence
(n, m) ∈ R.

Transitive: If (m, n) ∈ R and (n, p) ∈ R, then m + n = 2r is even and
n + p = 2s is even. Thus


m + n + n + p = 2r + 2s


and therefore
m + p = 2r + 2s - 2n = 2(r + s - n)


We conclude that m + p is even so that (m, p) ∈ R.
The equivalence classes are the set of even integers and the set of odd integers.


2. Reflexive: If (m, n) ∈ Z × (Z\{0}), then m · n = m · n so that
(m, n)R(m, n).
Symmetric: If (m, n)R(m', n') then m · n' = m' · n so that m' · n = m · n'.
Hence (m', n')R(m, n).
Transitive: If (m, n)R(m', n') and (m', n')R(m'', n''), then


m · n' = m' · n and m' · n'' = m'' · n'
Hence
m · n' · m' · n'' = m' · n · m'' · n'


Cancelling m' · n' from both sides, we find that

m · n'' = m'' · n

Hence (m, n)R(m'', n'').

The equivalence classes are sets of ordered pairs (m, n) for which the ratio
of m to n represents the same fraction. For instance, (1, 2), (3, 6), and
(10, 20) are in the same equivalence class.

3. Reflexive: If (x, y) ∈ R², then y = y so (x, y)R(x, y).
Symmetric: If (x, y)R(x', y') then y = y' so that y' = y hence
(x', y')R(x, y).
Transitive: If (x, y)R(x', y') and (x', y')R(x'', y'') then y = y' and y' = y''
so that y = y''. Hence (x, y)R(x'', y'').

The equivalence classes are horizontal lines. A useful representative for 
each equivalence class is the point where the horizontal line crosses the y- 
axis. 


4. Reflexive: If a is a person then a is the same as a so aRa.
Symmetric: If aRb then a and b are siblings (or the same person) with the 
same parents. Hence b and a are siblings (or the same person) with the same 
parents. We conclude that bRa. 
Transitive: If aRb and bRc then a and b are siblings (or the same person) 
with the same parents and b and c are siblings (or the same person) with the 
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same parents. Thus a and c are siblings (or the same person) with the same 
parents. 

The equivalence classes are sets of siblings in the same family with the 
same parents. 


. (a) Function 


(b) Not a function 


. (a) One-to-one 


(b) Not one-to-one 


. (c) 


f-g=lat-t):(.ne f, (5,0) €g} 


Parts (b) and (d) are similar. 


. (a) Domain = {x € R: x > 0} 


Image = {y E€ R: y > —3} 
(b) Domain = all people 
Image = all male parents 


9. Solution omitted. 
10. First show that c < a + b. Write c = a + b - y. Then examine a² + b² =
(a + b - y)².


Chapter 5 


1. The number system Q is closed under all four arithmetic operations provided
that we do not divide by 0. The set R\Q is not closed under any arithmetic
operation:


√2 - √2 = 0
(2 - √2) + (2 + √2) = 4
√2 · √2 = 2
√2/√2 = 1


2. Let x_j = q + √2/j.


. These sets are intervals. 


. If b is a square root of 6 then —b is also. Equivalently, the polynomial 


equations z? — 6 must have two roots. 
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Figure 5.1s An Argand diagram. 


6. First note that 1 is a cube root of this number. So z - 1 must divide the
polynomial p(z) = z³ - 1 (of which all the cube roots must be a solution).
Dividing this polynomial p by z - 1 we find the quotient q(z) = z² + z + 1.
Using the quadratic formula, we find the roots of q to be (-1 ± √3 i)/2.
These are the other two cube roots of 1.
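
The three roots can be checked numerically. The following is a small Python sketch, assuming the number in question is 1 and using the standard `cmath` module; results are exact only up to floating-point rounding.

```python
import cmath

# The real cube root of 1 together with the two roots of z^2 + z + 1
# given by the quadratic formula, (-1 ± sqrt(-3)) / 2.
roots = [1, (-1 + cmath.sqrt(-3)) / 2, (-1 - cmath.sqrt(-3)) / 2]

for z in roots:
    # Cubing each root should return 1 up to rounding error.
    print(z, abs(z**3 - 1) < 1e-12)
```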


7. Divide p by (z - α) to obtain
p(z) = q(z) · (z - α) + r
Here r is the remainder, and it must be of lower degree than the divisor
(z - α). Hence r is a constant. Now set z = α in this last equation. The
result is


0 = q(α) · 0 + r


We conclude that r = 0. Hence (z - α) evenly divides p.


8. Seeking a contradiction, we suppose that


√2 + √3 = a/b

where a, b are integers. Squaring both sides gives


2 + 3 + 2√6 = a²/b²


Simplifying yields

√6 = (a² - 5b²)/(2b²)


We conclude that √6 is a rational number, and that is patently false (by the
same proof that √2 is irrational).


Simply solve the equation 


(xi + yj+zk +w) =1+i+j 


Chapter 6 


1. Assuming that everyone is healthy, we may suppose that the weights of
people in the room range from 75 pounds to 200 pounds. That is a range of
126 possible values. And each of 300 people will have one such value. So
there are 300 letters and 126 mailboxes. It follows that two people will have
the same weight.


2. The waist measurements will be in the range 15 inches to 45 inches. That is
a span of 31 values. But there are 50 people. Just as in the last problem, two
people will have the same measurement.


3. If instead we measure waist size by millimeters, then the range will be from
15 × 25.4 = 381 to 45 × 25.4 = 1143. That is a span of 763 possible values.
Since there are just 50 people, each person could have a different waist
measurement in millimeters.


4. For the choice of the first 5, there are $\binom{20}{5}$ = 15504 possibilities. The other
three will be chosen from the remaining 15 people, and then there are $\binom{15}{3}$ =
455 possibilities. The total number of ways to assign 5 and then three people
to the two rooms is then 15504 × 455 = 7054320.
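
These binomial coefficients are easy to confirm with Python's standard `math.comb` (available from Python 3.8). The sketch assumes 20 people in total, which is what the values 15504 and 455 imply.

```python
from math import comb

first_room = comb(20, 5)    # ways to choose 5 people out of 20
second_room = comb(15, 3)   # ways to choose 3 of the remaining 15
total = first_room * second_room

print(first_room, second_room, total)
```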


5. Given any particular denomination of card (from two through ace), there are
four cards of that kind. There are four different ways to choose three from
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10. 


among those. And 13 different denominations. Hence there are 4 x 13 = 52 
different ways to form three-of-a-kind. 


6. The analysis here is similar to the last problem. Given any particular
denomination, there is just one way to form four of that kind. And there are 13
different denominations. Thus there are 13 different ways to form
four-of-a-kind.


7. The only ways to roll a 7 are 1-6, 6-1, 2-5, 5-2, 4-3, and 3-4
(where we are taking into account that there are two dice, so two ways to
realize any particular score). And there are 6 × 6 = 36 possible outcomes
altogether. So the likelihood is 6/36 = 1/6.


8. The only way to get a 2 is 1-1. Thus the chances of getting a 2 are 1/36.
Also there is only one way to get a 12. So the chances of getting a 12 are
1/36. Every other value can be achieved in more than one way, so these are
the only two values for which the odds are 1/36.
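
The whole distribution for two dice can be tabulated by listing all 36 ordered outcomes. This is a short Python sketch using only the standard library; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 ordered rolls (a, b) produce each total.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
prob = {total: Fraction(n, 36) for total, n in counts.items()}

for total in sorted(prob):
    print(total, counts[total], prob[total])
```

Only the totals 2 and 12 occur in exactly one way, and 7 is the most likely total.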


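9. The only ways to get a 10 are


1-3-6  1-4-5  2-2-6  2-3-5
2-4-4  3-3-4


and the permutations of each of these. The three rolls with distinct faces
have 6 permutations each, while the three rolls with a repeated face have
only 3 each. So there are 3 × 6 + 3 × 3 = 27 rolls that give 10. There are
6 × 6 × 6 = 216 possible rolls. Thus the odds are 27/216 = 1/8.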
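
Hand counts like this one are easy to get wrong, because a roll with a repeated face has fewer than six distinct orderings. A brute-force enumeration, sketched here in Python with illustrative variable names, lists the ordered rolls directly and can be compared against the hand count.

```python
from fractions import Fraction
from itertools import product

# Every ordered roll of three dice whose faces add up to 10.
rolls = [r for r in product(range(1, 7), repeat=3) if sum(r) == 10]

# The distinct unordered patterns behind those rolls.
patterns = sorted({tuple(sorted(r)) for r in rolls})

probability = Fraction(len(rolls), 6**3)
print(patterns)
print(len(rolls), probability)
```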


10. Let F(x) = a_0 + a_1 x + a_2 x² + ···. Then xF(x) = a_0 x + a_1 x² + a_2 x³ + ···
and x²F(x) = a_0 x² + a_1 x³ + a_2 x⁴ + ···. Then

$$\begin{aligned}
F(x) - xF(x) - 2x^2F(x) &= (a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots) \\
&\quad - (a_0x + a_1x^2 + a_2x^3 + \cdots) \\
&\quad - 2(a_0x^2 + a_1x^3 + a_2x^4 + \cdots) \\
&= a_0 + (a_1 - a_0)x + (a_2 - a_1 - 2a_0)x^2 + (a_3 - a_2 - 2a_1)x^3 + \cdots \\
&= a_0 + (a_1 - a_0)x \\
&= 3 - 8x
\end{aligned}$$

We conclude that

$$F(x) = \frac{3 - 8x}{1 - x - 2x^2} = \frac{3 - 8x}{(1 + x)(1 - 2x)}$$

We apply the method of partial fractions to this last expression to obtain

$$F(x) = \frac{11/3}{1 + x} - \frac{2/3}{1 - 2x}$$

Now expanding these expressions in geometric series as usual, we have

$$F(x) = \frac{11}{3}\sum_{j=0}^{\infty} (-x)^j - \frac{2}{3}\sum_{j=0}^{\infty} (2x)^j$$

Identifying power series coefficients, we find that

$$a_j = \frac{11}{3}(-1)^j - \frac{2}{3}\,2^j$$

That is the solution of our recurrence relation.
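
A closed form obtained from a generating function can be tested against the recurrence itself. The Python sketch below assumes the recurrence is a_j = a_(j-1) + 2a_(j-2) with a_0 = 3 and a_1 = -5, which is what the identity F(x) - xF(x) - 2x²F(x) = 3 - 8x encodes; the function names are illustrative.

```python
from fractions import Fraction

def by_recurrence(count):
    """First `count` terms of a_j = a_(j-1) + 2*a_(j-2), a_0 = 3, a_1 = -5."""
    terms = [3, -5]
    while len(terms) < count:
        terms.append(terms[-1] + 2 * terms[-2])
    return terms[:count]

def closed_form(j):
    """Candidate solution a_j = (11/3)(-1)^j - (2/3) 2^j, in exact arithmetic."""
    return Fraction(11, 3) * (-1) ** j - Fraction(2, 3) * 2 ** j

for j, term in enumerate(by_recurrence(8)):
    print(j, term, closed_form(j))
```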
11. Draw some pictures. 


Chapter 7 


1. 4 × 3
2. 3 × 5
3. 


—3 0 13 
9 5 15 
13 —42 
22 9 
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—1 —10 4 
1 29 


6. We write the steps of gaussian elimination in order:

[ 1  0  1 |  1  0  0 ]
[ 0  1  1 |  0  1  0 ]
[ 2  1  0 |  0  0  1 ]

[ 1  0  1 |  1  0  0 ]
[ 0  1  1 |  0  1  0 ]
[ 0  1 -2 | -2  0  1 ]

[ 1  0  1 |  1  0  0 ]
[ 0  1  1 |  0  1  0 ]
[ 0  0 -3 | -2 -1  1 ]

[ 1  0  1 |    1    0    0 ]
[ 0  1  0 | -2/3  2/3  1/3 ]
[ 0  0 -3 |   -2   -1    1 ]

[ 1  0  1 |    1    0    0 ]
[ 0  1  0 | -2/3  2/3  1/3 ]
[ 0  0  1 |  2/3  1/3 -1/3 ]

[ 1  0  0 |  1/3 -1/3  1/3 ]
[ 0  1  0 | -2/3  2/3  1/3 ]
[ 0  0  1 |  2/3  1/3 -1/3 ]

7. The rows are linearly independent so the matrix induces a mapping of R³
that is one-to-one and onto. So it must be invertible.


8. The probability of heads on any given flip is 0.666... and the probability of 
tails is 0.333 . . .. This is true regardless of the history of previous flips. Thus 
the likelihood of two heads in a row is 0.444... and the likelihood of two 
tails in a row is 0.111.... 
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Chapter 8 


1. This graph (Fig. 8.1s) does not have an Euler path because, once you go to 
the end of a (vertical) leg, you cannot get back. 


Figure 8.1s A graph on five vertices without an Euler path. 


2. This graph (Fig. 8.2s) has two distinct Euler paths because the triangle can 
be traversed clockwise or counterclockwise. 


Figure 8.2s A graph on five vertices with two distinct Euler paths. 


3. We know that the Euler number for a sphere with one handle is 0. See 
Fig. 8.3s, in which V = 1, E = 2, and F = 1. If we add a second handle that 
changes the Euler number to —2. This point can be seen by examining Fig. 
8.4s in which V = 4, E = 10, and F = 4. 
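
The two counts quoted here can be checked against the formula χ = V - E + F and the genus formula 2 - 2g. A minimal Python sketch, with illustrative function names:

```python
def euler_number(vertices, edges, faces):
    """Euler number of a surface decomposition: V - E + F."""
    return vertices - edges + faces

def genus_formula(handles):
    """Euler number of a sphere with g handles: 2 - 2g."""
    return 2 - 2 * handles

print(euler_number(1, 2, 1), genus_formula(1))    # one handle
print(euler_number(4, 10, 4), genus_formula(2))   # two handles
```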

4. Such a graph will have $\binom{5}{2}$ = 10 edges. It will have $\binom{5}{3}$ = 10 faces (because
each face will be a triangle).


5. The complete graph on k vertices has $\binom{k}{2}$ = k(k - 1)/2 edges.
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Figure 8.3s The Euler number of the sphere with one handle. 





Figure 8.4s The Euler number of the sphere with two handles. 


6. This graph has nine edges since each of three vertices in the first row must 
be connected to each of three vertices in the second row (or vice versa). 


7. Of course there are five vertices. There are also five edges. 


8. In Fig. 8.5s, the left-hand graph illustrates the first desideratum and the right- 
hand graph illustrates the second. 


Figure 8.5s Two particular graphs. 


9. The formula is χ = 2 - 2g.
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Chapter 9 


1. (a) 
(b) 
(c) 
(d) 
(e) 
(f) 1 


2. Let m = m' + 2k and n = n' + 2ℓ, where m' and n' are each either 0 or 1.
Then


YN NF CO WN 


m · n = m' · n' + 2(kn' + ℓm' + 2kℓ)
Thus
(m · n) mod 2 = m' · n' = (m mod 2) · (n mod 2)

3. Let m = m' + 2k and n = n' + 2ℓ, where m' and n' are each either 0 or 1.
Then


m + n = m' + n' + 2(k + ℓ)
Thus
(m + n) mod 2 = (m' + n') mod 2 = ((m mod 2) + (n mod 2)) mod 2
4. The prime factorization is 111 = 3 · 37 and 211 is prime. The numbers
clearly have no prime factors in common.


5. We write 1024 = 2 · 2 · 2 · 2 · 2 · 2 · 2 · 2 · 2 · 2 and 100 = 2 · 2 · 5 · 5.
Clearly the greatest common divisor is 2 · 2 = 4.


6. Plainly the sum of two 2 x 2 matrices is another 2 x 2 matrix. Matrix addi- 
tion is associative just because ordinary addition of numbers is. The additive 


identity will be 
0 0 
0 0 
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Given a 2 x 2 matrix 


7. The 2 x 2 matrix 


1 0 
A= 
0 0 
does not have a multiplicative inverse. Hence these matrices do not form a 
group under multiplication. 


8. We check that 


(a · b) · (b⁻¹ · a⁻¹) = ((a · b) · b⁻¹) · a⁻¹
= (a · (b · b⁻¹)) · a⁻¹
= (a · e) · a⁻¹
= a · a⁻¹
= e
A similar calculation shows that
(b⁻¹ · a⁻¹) · (a · b) = e


It follows then that b⁻¹ · a⁻¹ is the multiplicative inverse of a · b.


9. The polynomial p(x) = x? + 1 does not have a multiplicative inverse in the 
polynomials. So the polynomials do not form a group under multiplication. 


10. (a) 


80 = 5 · 15 + 5
15 = 3 · 5 + 0


So 5 is the greatest common divisor. 
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(b) 
92 = 3 · 24 + 20
24 = 1 · 20 + 4
20 = 5 · 4 + 0


Therefore 4 is the greatest common divisor. 
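
The repeated divisions used in both parts are the Euclidean algorithm, and they are simple to automate. This Python sketch records each division step; `euclid_steps` is an illustrative name, not a library function.

```python
def euclid_steps(a, b):
    """Run the Euclidean algorithm, recording each step a = q * b + r."""
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, q, b, r))
        a, b = b, r
    # When the remainder reaches 0, the last divisor is the gcd.
    return a, steps

for pair in [(80, 15), (92, 24)]:
    gcd, steps = euclid_steps(*pair)
    for a, q, b, r in steps:
        print(f"{a} = {q} * {b} + {r}")
    print("gcd:", gcd)
```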


Chapter 10 


1. We use the standard transliteration A → 0, B → 1, and so on to transform
the given message to the list of numbers


1 24 4 1 24 4 1 8 17 3 8 4


Now we perform the linear transformation P ↦ P - 3, applied modulo 26.
We obtain


24 21 1 24 21 1 24 5 14 0 5 1
Again the transliteration now yields the encrypted message
YVBYVBYFOAFB
2. The coded message transliterates to the sequence of numbers 
4 0 23 0 25 18 13 12 13 10
The decryption algorithm, applied modulo 26, yields now the sequence
18 14 11 14 13 6 1 0 1 24
The usual transliteration converts this to 
SOLONGBABY 


Remembering that we need to insert spaces and punctuation, we finally re- 
trieve the message 


SO LONG BABY 
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3. We notice that R occurs five times in the encrypted message. Since E is the 
most commonly occurring letter in the English language, we guess that E 
has been encoded as R. Thus we guess that the shift encryption being used 
here is P ↦ P + 13. Thus the decryption algorithm is P ↦ P - 13. We
transliterate the encrypted message as usual to


25 17 17 6 25 17 20 17 4 17
Now we perform the decryption to obtain
12 4 4 -7 12 4 7 4 -9 4
Finally, this string of numbers (with -7 and -9 read modulo 26 as 19 and 17) transliterates to
MEETMEHERE 
Adding spacing as usual gives the message 
MEET ME HERE 
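
The frequency argument used here can be sketched in a few lines of Python. The ciphertext below is rebuilt from the number list in the solution, and the guess that the most common letter stands for E is the same heuristic as in the text; the helper `shift` is illustrative.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase  # A -> 0, B -> 1, ..., Z -> 25

def shift(text, k):
    """Apply the shift P -> P + k (modulo 26) to each letter of `text`."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + k) % 26] for ch in text)

cipher_numbers = [25, 17, 17, 6, 25, 17, 20, 17, 4, 17]
ciphertext = "".join(ALPHABET[n] for n in cipher_numbers)

# Guess that the most frequent ciphertext letter is the image of E.
most_common_letter, occurrences = Counter(ciphertext).most_common(1)[0]
guessed_shift = (ALPHABET.index(most_common_letter) - ALPHABET.index("E")) % 26

plaintext = shift(ciphertext, -guessed_shift)
print(ciphertext, most_common_letter, occurrences, guessed_shift, plaintext)
```

The heuristic can fail on short messages, so the decrypted text should always be read to see whether it makes sense.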
4. We transliterate the message to 
7 4 11 11 14 12 24 7 14 13 4 24
Applying the affine encryption scheme (modulo 26 as usual) gives the result
6 23 18 18 1 21 5 6 1 24 23 5
This finally becomes the encrypted message 
GXSSBVFGBYXF 
5. Under the usual transliteration, the message becomes 
17 3 16 24 15 7 25 24 3 16 24 15 
The affine decryption scheme transforms this to 


2 0 13 3 24 8 18 3 0 13 3 24


(Notice that we have had to do some division modulo 26.) Finally, this translit- 
erates to 


CANDYISDANDY 
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Inserting spaces as usual gives the message 
CANDY IS DANDY 
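
The "division modulo 26" mentioned in this solution is multiplication by a modular inverse. The Python sketch below assumes the encryption was P ↦ 7P + 3: that key is not quoted in the solution itself but is inferred from the two number lists, so treat it as an assumption. The three-argument `pow` with exponent -1 needs Python 3.8 or later.

```python
import string

ALPHABET = string.ascii_uppercase

def affine_decrypt(numbers, a, b, modulus=26):
    """Invert C = a*P + b (mod modulus) by computing P = a^(-1) * (C - b)."""
    a_inverse = pow(a, -1, modulus)  # raises ValueError if a has no inverse
    return [(a_inverse * (c - b)) % modulus for c in numbers]

cipher_numbers = [17, 3, 16, 24, 15, 7, 25, 24, 3, 16, 24, 15]
plain_numbers = affine_decrypt(cipher_numbers, a=7, b=3)  # assumed key (7, 3)
message = "".join(ALPHABET[n] for n in plain_numbers)
print(plain_numbers)
print(message)
```

An affine map is only invertible when its multiplier is coprime to 26, which is why the inverse is computed explicitly.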
6. The digraphs are 
TH IS WA SN OT TH EN DX 


Notice how we have added an X on the end so that the digraphs come out 
even. These digraphs correspond to the pairs of numbers 


(19,7) (8,18) (22,0) (18,13) (14,19) (19,7) (4,13) (4,23) 
Now, according to the algorithm in the text, these pairs correspond to
501 226 572 481 383 501 117 127 
Now we encrypt this list of numbers as 
158 9 371 98 480 158 358 478 
This translates into the roman alphabet as
GC AJ OH DU SM GC NU SK 
In other words, our encrypted message is 
GCAJOHDUSMGCNUSK 
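
The conversion between digraphs and numbers can be sketched in Python. The rule 26x + y for a pair (x, y) is inferred from the values above (for instance (19, 7) becomes 501), so it is stated here as an assumption about the text's algorithm; the function names are illustrative.

```python
import string

ALPHABET = string.ascii_uppercase

def digraph_to_number(pair):
    """Encode a two-letter block (x, y) as 26*x + y, a number below 676."""
    first, second = pair
    return 26 * ALPHABET.index(first) + ALPHABET.index(second)

def number_to_digraph(n):
    """Decode a number below 676 back into its two-letter block."""
    first, second = divmod(n, 26)
    return ALPHABET[first] + ALPHABET[second]

encrypted = [158, 9, 371, 98, 480, 158, 358, 478]
ciphertext = "".join(number_to_digraph(n) for n in encrypted)
print(digraph_to_number("TH"), digraph_to_number("IS"))
print(ciphertext)
```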
7. The standard transliteration of the given message is 
13 14 22 8 18 19 7 4 19 8 12 4
Application of the encryption algorithm then yields 


0 6 24 1604 18 64 16 4 6 




Chapter 11 


1. Imitating the example in the text, this reduces to 


[a x b] + [b x T] + [a x T] 


2. We write 


a × (a + b) = (a × a) + (a × b)
= a + (a × b)
But if we remember that a × b is the intersection of a and b, then a × b is
a subset of a. So this last line must be (remembering that + is union) just a
itself.


. This is just a boolean rendition of the familiar fact 


ᶜ(A ∪ B) = ᶜA ∩ ᶜB


6. Similar to Sol. 2. 


7. Interpret in the language of intersection and union, and then the assertion is 


clear. 


Chapter 12 


= = 
OO RO OO SY ON ORO 


0 
0 


. For ε > 0, let j > 1/ε - 7.
. Fore > 0, let j > log, (1/e). 
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Chapter 13 


1. 5, 5 + 5/2, 5 + 5/2 + 5/4, 5 + 5/2 + 5/4 + 5/8
2. 1/3, 1/3 + 1/9, 1/3 + 1/9 + 1/27, 1/3 + 1/9 + 1/27 + 1/81
3. Diverges
4. Converges to 1
5. The terms, for j > 10, are smaller than 2/j². So the series converges.
6. The terms are smaller in absolute value than 2^(-j). So the series converges.
7. The sum is 8/7.
8. The sum is 7/4.
37-1 


37 _ 36° 
110 — 1 


O ANN MN BP WNY 


. The sum is 





. The sum is 114 - 


— 
© 





es ee 


11. Converges
12. Diverges
13. The ratio is 1/(j + 1), which tends to 0. So the series converges.
14. The root is 3j/(j + 1), which tends to 3. So the series diverges.
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